Neural network with unbounded activation functions is universal approximator

Sho Sonoda, Noboru Murata

    Research output: Contribution to journal › Article

    20 Citations (Scopus)

    Abstract

    This paper presents an investigation of the approximation property of neural networks with unbounded activation functions, such as the rectified linear unit (ReLU), which is the new de-facto standard of deep learning. The ReLU network can be analyzed by the ridgelet transform with respect to Lizorkin distributions. By showing three reconstruction formulas by using the Fourier slice theorem, the Radon transform, and Parseval's relation, it is shown that a neural network with unbounded activation functions still satisfies the universal approximation property. As an additional consequence, the ridgelet transform, or the backprojection filter in the Radon domain, is what the network learns after backpropagation. Subject to a constructive admissibility condition, the trained network can be obtained by simply discretizing the ridgelet transform, without backpropagation. Numerical examples not only support the consistency of the admissibility condition but also imply that some non-admissible cases result in low-pass filtering.
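
    In one common normalization (the paper's own conventions may differ in constants and measure), the objects named in the abstract can be written compactly. The ridgelet transform of a function f on R^m with respect to a ridgelet ψ, the reconstruction formula that realizes a one-hidden-layer network with activation η as an integral representation of f, and the admissibility condition coupling ψ and η read:

        \mathscr{R}_\psi f(a, b) = \int_{\mathbb{R}^m} f(x)\, \overline{\psi(a \cdot x - b)}\, \mathrm{d}x, \qquad (a, b) \in \mathbb{R}^m \times \mathbb{R},

        f(x) = \frac{1}{K_{\psi,\eta}} \int_{\mathbb{R}^m \times \mathbb{R}} \mathscr{R}_\psi f(a, b)\, \eta(a \cdot x - b)\, \mathrm{d}a\, \mathrm{d}b,

        K_{\psi,\eta} = (2\pi)^{m-1} \int_{\mathbb{R}} \frac{\overline{\hat{\psi}(\zeta)}\, \hat{\eta}(\zeta)}{|\zeta|^m}\, \mathrm{d}\zeta, \qquad 0 < |K_{\psi,\eta}| < \infty.

    For unbounded activations such as ReLU, \hat{\eta} exists only as a tempered distribution, which is why the transform is taken with respect to Lizorkin distributions (roughly, test functions all of whose moments vanish), so that the pairing defining K_{\psi,\eta} makes sense.

    The abstract's last claim, that a trained network can be obtained by discretizing the integral representation rather than by backpropagation, invites a quick numerical illustration. The sketch below is not the paper's ridgelet discretization; it is a simplified stand-in that samples the (a, b) parameters at random and fits the outer coefficients by least squares, which is enough to watch a finite sum of unbounded ReLU ridge functions approximate a smooth target. All names and constants here are illustrative choices, not taken from the paper.

        # Simplified sketch (not the paper's construction): approximate a smooth
        # target f : R^2 -> R by a finite ReLU network
        #     f(x) ~ sum_j c_j * relu(a_j . x - b_j),
        # with (a_j, b_j) sampled at random and c_j fit by least squares.
        import numpy as np

        rng = np.random.default_rng(0)

        def f(x):
            # Smooth target on [-1, 1]^2.
            return np.sin(np.pi * x[:, 0]) * np.exp(-x[:, 1] ** 2)

        # Training grid on [-1, 1]^2.
        g = np.linspace(-1.0, 1.0, 40)
        X = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
        y = f(X)

        # Discretize the (a, b) parameter space by random sampling.
        n_units = 500
        a = 3.0 * rng.normal(size=(n_units, 2))   # ridge directions and scales
        b = rng.uniform(-4.0, 4.0, size=n_units)  # ridge offsets

        # Hidden layer: unbounded ReLU ridge functions eta(a . x - b).
        H = np.maximum(X @ a.T - b, 0.0)

        # Outer coefficients by least squares.
        c, *_ = np.linalg.lstsq(H, y, rcond=None)

        rmse = np.sqrt(np.mean((H @ c - y) ** 2))
        print(f"RMSE with {n_units} ReLU units: {rmse:.4f}")

    Increasing n_units drives the error toward zero, which is the finite-sample face of the universal approximation property; the paper's stronger point is that for an admissible pair (ψ, η) the coefficients need not be fit at all, since they can be read off from the discretized ridgelet transform of f.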

    Original language: English
    Journal: Applied and Computational Harmonic Analysis
    DOI: 10.1016/j.acha.2015.12.005
    Publication status: Accepted/In press - 2015 May 14

    Keywords

    • Admissibility condition
    • Backprojection filter
    • Bounded extension to L2
    • Integral representation
    • Lizorkin distribution
    • Neural network
    • Radon transform
    • Rectified linear unit (ReLU)
    • Ridgelet transform
    • Universal approximation

    ASJC Scopus subject areas

    • Applied Mathematics

    Cite this

    @article{9b581e82652749d88f7bb07db311a31d,
    title = "Neural network with unbounded activation functions is universal approximator",
    abstract = "This paper presents an investigation of the approximation property of neural networks with unbounded activation functions, such as the rectified linear unit (ReLU), which is the new de-facto standard of deep learning. The ReLU network can be analyzed by the ridgelet transform with respect to Lizorkin distributions. By showing three reconstruction formulas by using the Fourier slice theorem, the Radon transform, and Parseval's relation, it is shown that a neural network with unbounded activation functions still satisfies the universal approximation property. As an additional consequence, the ridgelet transform, or the backprojection filter in the Radon domain, is what the network learns after backpropagation. Subject to a constructive admissibility condition, the trained network can be obtained by simply discretizing the ridgelet transform, without backpropagation. Numerical examples not only support the consistency of the admissibility condition but also imply that some non-admissible cases result in low-pass filtering.",
    keywords = "Admissibility condition, Backprojection filter, Bounded extension to L2, Integral representation, Lizorkin distribution, Neural network, Radon transform, Rectified linear unit (ReLU), Ridgelet transform, Universal approximation",
    author = "Sho Sonoda and Noboru Murata",
    year = "2015",
    month = "5",
    day = "14",
    doi = "10.1016/j.acha.2015.12.005",
    language = "English",
    journal = "Applied and Computational Harmonic Analysis",
    issn = "1063-5203",
    publisher = "Academic Press Inc.",
    }

    TY - JOUR
    T1 - Neural network with unbounded activation functions is universal approximator
    AU - Sonoda, Sho
    AU - Murata, Noboru
    PY - 2015/5/14
    Y1 - 2015/5/14
    AB - This paper presents an investigation of the approximation property of neural networks with unbounded activation functions, such as the rectified linear unit (ReLU), which is the new de-facto standard of deep learning. The ReLU network can be analyzed by the ridgelet transform with respect to Lizorkin distributions. By showing three reconstruction formulas by using the Fourier slice theorem, the Radon transform, and Parseval's relation, it is shown that a neural network with unbounded activation functions still satisfies the universal approximation property. As an additional consequence, the ridgelet transform, or the backprojection filter in the Radon domain, is what the network learns after backpropagation. Subject to a constructive admissibility condition, the trained network can be obtained by simply discretizing the ridgelet transform, without backpropagation. Numerical examples not only support the consistency of the admissibility condition but also imply that some non-admissible cases result in low-pass filtering.
    KW - Admissibility condition
    KW - Backprojection filter
    KW - Bounded extension to L2
    KW - Integral representation
    KW - Lizorkin distribution
    KW - Neural network
    KW - Radon transform
    KW - Rectified linear unit (ReLU)
    KW - Ridgelet transform
    KW - Universal approximation
    UR - http://www.scopus.com/inward/record.url?scp=84960887035&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=84960887035&partnerID=8YFLogxK
    U2 - 10.1016/j.acha.2015.12.005
    DO - 10.1016/j.acha.2015.12.005
    M3 - Article
    JO - Applied and Computational Harmonic Analysis
    JF - Applied and Computational Harmonic Analysis
    SN - 1063-5203
    ER -