Nonparametric Bayesian dereverberation of power spectrograms based on infinite-order autoregressive processes

Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno

    Research output: Contribution to journal (Article)

    5 Citations (Scopus)

    Abstract

    This paper describes a monaural audio dereverberation method that operates in the power spectrogram domain. The method is robust to different kinds of source signals such as speech or music. Moreover, it requires little manual intervention, such as specifying the complexity of the room acoustics. The method is based on a non-conjugate Bayesian model of the power spectrogram. It extends the idea of multi-channel linear prediction to the power spectrogram domain, and formulates a model of reverberation as a non-negative, infinite-order autoregressive process. To this end, the power spectrogram is interpreted as histogram count data, which allows a nonparametric Bayesian model to be used as the prior for the autoregressive process, so that the effective number of active components can grow, without bound, with the complexity of the data. To determine the marginal posterior distribution, a convergent algorithm inspired by the variational Bayes method is formulated. It employs the minorization-maximization technique to arrive at an iterative, convergent algorithm that approximates the marginal posterior distribution. Both objective and subjective evaluations show an advantage over other methods based on the power spectrum. We also apply the method to a music information retrieval task and demonstrate its effectiveness.
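
The idea of modeling reverberation as a non-negative autoregressive process in the power domain can be illustrated with a toy sketch. It is not the algorithm of the paper: it assumes a finite order, a single frequency bin, and coefficients that are already known, whereas the paper infers an infinite-order model with a nonparametric Bayesian prior. The recursion form and the names `reverberate`, `dereverberate`, `dry`, `wet`, and `coeffs` are assumptions made for illustration only.

```python
def reverberate(dry, coeffs):
    """Simulate reverberant power in one frequency bin (illustrative assumption).

    wet[t] = dry[t] + sum_k coeffs[k-1] * wet[t-k], with all terms non-negative,
    so past reverberant power leaks into the current frame.
    """
    wet = []
    for t, s in enumerate(dry):
        tail = sum(g * wet[t - k]
                   for k, g in enumerate(coeffs, start=1) if t - k >= 0)
        wet.append(s + tail)
    return wet


def dereverberate(wet, coeffs):
    """Invert the recursion when the coefficients are known.

    Subtract the power predicted from past observed frames and clip at zero
    so that the estimate stays a valid (non-negative) power value.
    """
    est = []
    for t, x in enumerate(wet):
        tail = sum(g * wet[t - k]
                   for k, g in enumerate(coeffs, start=1) if t - k >= 0)
        est.append(max(x - tail, 0.0))
    return est


if __name__ == "__main__":
    dry = [1.0, 0.0, 0.0, 2.0, 0.0]   # hypothetical dry power values
    coeffs = [0.5, 0.25]              # hypothetical non-negative AR coefficients
    wet = reverberate(dry, coeffs)
    print(wet)
    print(dereverberate(wet, coeffs))
```

In this toy setting the dry sequence is recovered exactly because the coefficients are given. The hard part addressed by the paper is estimating the coefficients and the model order from the reverberant observation alone.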

    Original language: English
    Article number: 6894190
    Pages (from-to): 1918-1930
    Number of pages: 13
    Journal: IEEE/ACM Transactions on Speech and Language Processing
    Volume: 22
    Issue number: 12
    DOIs: 10.1109/TASLP.2014.2355772
    Publication status: Published - 2014 Dec 1

    Fingerprint

    autoregressive processes
    Spectrogram
    Bayesian Nonparametrics
    spectrograms
    Autoregressive Process
    Bayesian Model
    Marginal Distribution
    Posterior distribution
    Reverberation
    Power spectrum
    information retrieval
    Information retrieval
    linear prediction
    Variational Bayes
    Bayes Method
    music
    reverberation
    Acoustics
    Linear Prediction
    Subjective Evaluation

    Keywords

    • Dereverberation
    • Minorization maximization
    • Nonparametric Bayes

    ASJC Scopus subject areas

    • Signal Processing
    • Electrical and Electronic Engineering
    • Media Technology
    • Acoustics and Ultrasonics
    • Instrumentation
    • Linguistics and Language
    • Speech and Hearing

    Cite this

    Nonparametric Bayesian dereverberation of power spectrograms based on infinite-order autoregressive processes. / Maezawa, Akira; Itoyama, Katsutoshi; Yoshii, Kazuyoshi; Okuno, Hiroshi G.

    In: IEEE/ACM Transactions on Speech and Language Processing, Vol. 22, No. 12, 6894190, 01.12.2014, p. 1918-1930.

    @article{18dce43d298049c3ac74292d2b9a8fc9,
    title = "Nonparametric Bayesian dereverberation of power spectrograms based on infinite-order autoregressive processes",
    abstract = "This paper describes a monaural audio dereverberation method that operates in the power spectrogram domain. The method is robust to different kinds of source signals such as speech or music. Moreover, it requires little manual intervention, such as specifying the complexity of the room acoustics. The method is based on a non-conjugate Bayesian model of the power spectrogram. It extends the idea of multi-channel linear prediction to the power spectrogram domain, and formulates a model of reverberation as a non-negative, infinite-order autoregressive process. To this end, the power spectrogram is interpreted as histogram count data, which allows a nonparametric Bayesian model to be used as the prior for the autoregressive process, so that the effective number of active components can grow, without bound, with the complexity of the data. To determine the marginal posterior distribution, a convergent algorithm inspired by the variational Bayes method is formulated. It employs the minorization-maximization technique to arrive at an iterative, convergent algorithm that approximates the marginal posterior distribution. Both objective and subjective evaluations show an advantage over other methods based on the power spectrum. We also apply the method to a music information retrieval task and demonstrate its effectiveness.",
    keywords = "Dereverberation, Minorization maximization, Nonparametric Bayes",
    author = "Akira Maezawa and Katsutoshi Itoyama and Kazuyoshi Yoshii and Okuno, {Hiroshi G.}",
    year = "2014",
    month = "12",
    day = "1",
    doi = "10.1109/TASLP.2014.2355772",
    language = "English",
    volume = "22",
    pages = "1918--1930",
    journal = "IEEE/ACM Transactions on Speech and Language Processing",
    issn = "2329-9290",
    publisher = "IEEE",
    number = "12",

    }

    TY - JOUR
    T1 - Nonparametric Bayesian dereverberation of power spectrograms based on infinite-order autoregressive processes
    AU - Maezawa, Akira
    AU - Itoyama, Katsutoshi
    AU - Yoshii, Kazuyoshi
    AU - Okuno, Hiroshi G.
    PY - 2014/12/1
    Y1 - 2014/12/1
    N2 - This paper describes a monaural audio dereverberation method that operates in the power spectrogram domain. The method is robust to different kinds of source signals such as speech or music. Moreover, it requires little manual intervention, such as specifying the complexity of the room acoustics. The method is based on a non-conjugate Bayesian model of the power spectrogram. It extends the idea of multi-channel linear prediction to the power spectrogram domain, and formulates a model of reverberation as a non-negative, infinite-order autoregressive process. To this end, the power spectrogram is interpreted as histogram count data, which allows a nonparametric Bayesian model to be used as the prior for the autoregressive process, so that the effective number of active components can grow, without bound, with the complexity of the data. To determine the marginal posterior distribution, a convergent algorithm inspired by the variational Bayes method is formulated. It employs the minorization-maximization technique to arrive at an iterative, convergent algorithm that approximates the marginal posterior distribution. Both objective and subjective evaluations show an advantage over other methods based on the power spectrum. We also apply the method to a music information retrieval task and demonstrate its effectiveness.
    KW - Dereverberation
    KW - Minorization maximization
    KW - Nonparametric Bayes
    UR - http://www.scopus.com/inward/record.url?scp=84921764278&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=84921764278&partnerID=8YFLogxK
    U2 - 10.1109/TASLP.2014.2355772
    DO - 10.1109/TASLP.2014.2355772
    M3 - Article
    AN - SCOPUS:84921764278
    VL - 22
    SP - 1918
    EP - 1930
    JO - IEEE/ACM Transactions on Speech and Language Processing
    JF - IEEE/ACM Transactions on Speech and Language Processing
    SN - 2329-9290
    IS - 12
    M1 - 6894190
    ER -