Statistical models for speech dereverberation

Takuya Yoshioka, Hirokazu Kameoka, Tomohiro Nakatani, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

This paper discusses a statistical-model-based approach to speech dereverberation. With this approach, we first define parametric statistical models of probability density functions (pdfs) for a clean speech signal and a room transmission channel, then estimate the model parameters, and finally recover the clean speech signal by using the pdfs with the estimated parameter values. The key to the success of this approach lies in the definition of the models of the clean speech signal and room transmission channel pdfs. This paper presents several statistical models (including newly proposed ones) and compares them in a large-scale experiment. As regards the room transmission channel pdf, an autoregressive (AR) model, an autoregressive power spectral density (ARPSD) model, and a moving-average power spectral density (MAPSD) model are considered. A clean speech signal pdf model is selected according to the room transmission channel pdf model. The AR model exhibited the highest dereverberation accuracy when a reverberant speech signal of 2 sec or longer was available, while the other two models outperformed the AR model when only a 1-sec reverberant speech signal was available.
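The define, estimate, recover sequence described in the abstract can be illustrated with a toy sketch. This is not the paper's AR, ARPSD, or MAPSD model. It assumes a single-tap AR channel, x[t] = s[t] + a * x[t - D], and a white Gaussian clean signal, in which case the maximum-likelihood estimate of a reduces to least squares. The function names and the values a = 0.6 and D = 8 are hypothetical choices made only for illustration.

```python
import random


def simulate_reverb(clean, a, delay):
    """Toy AR channel (assumption): x[t] = s[t] + a * x[t - delay]."""
    x = []
    for t, s in enumerate(clean):
        past = x[t - delay] if t >= delay else 0.0
        x.append(s + a * past)
    return x


def estimate_ar_coeff(x, delay):
    """Least-squares estimate of the single AR coefficient.

    With a white Gaussian clean signal this equals the ML estimate.
    """
    num = sum(x[t] * x[t - delay] for t in range(delay, len(x)))
    den = sum(x[t - delay] ** 2 for t in range(delay, len(x)))
    return num / den


def dereverberate(x, a_hat, delay):
    """Recover the clean signal by subtracting the predicted reverberation."""
    return [
        x[t] - (a_hat * x[t - delay] if t >= delay else 0.0)
        for t in range(len(x))
    ]


# Step 1: define the models (white Gaussian source, single-tap AR channel).
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(20000)]
reverberant = simulate_reverb(clean, a=0.6, delay=8)

# Step 2: estimate the channel parameter from the reverberant signal only.
a_hat = estimate_ar_coeff(reverberant, delay=8)

# Step 3: recover the clean signal with the estimated parameter.
recovered = dereverberate(reverberant, a_hat, delay=8)
mse = sum((r - s) ** 2 for r, s in zip(recovered, clean)) / len(clean)
```

Real speech is neither white nor stationary, which is why the choice of clean speech pdf model matters in the approach the abstract describes; the sketch only shows the overall flow of the three steps.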

Original language: English
Title of host publication: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Pages: 145-148
Number of pages: 4
DOI: https://doi.org/10.1109/ASPAA.2009.5346489
Publication status: Published - 2009
Externally published: Yes
Event: 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009 - New Paltz, NY
Duration: 2009 Oct 18 to 2009 Oct 21

Other

Other: 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009
City: New Paltz, NY
Period: 09/10/18 to 09/10/21

Fingerprint

Probability density function
Power spectral density
Statistical Models
Experiments

Keywords

  • Dereverberation
  • Statistical model

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications

Cite this

Yoshioka, T., Kameoka, H., Nakatani, T., & Okuno, H. G. (2009). Statistical models for speech dereverberation. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (pp. 145-148). [5346489] https://doi.org/10.1109/ASPAA.2009.5346489

Statistical models for speech dereverberation. / Yoshioka, Takuya; Kameoka, Hirokazu; Nakatani, Tomohiro; Okuno, Hiroshi G.

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2009. p. 145-148 5346489.

Yoshioka, T, Kameoka, H, Nakatani, T & Okuno, HG 2009, Statistical models for speech dereverberation. in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics., 5346489, pp. 145-148, 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009, New Paltz, NY, 09/10/18. https://doi.org/10.1109/ASPAA.2009.5346489
Yoshioka T, Kameoka H, Nakatani T, Okuno HG. Statistical models for speech dereverberation. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2009. p. 145-148. 5346489 https://doi.org/10.1109/ASPAA.2009.5346489
Yoshioka, Takuya ; Kameoka, Hirokazu ; Nakatani, Tomohiro ; Okuno, Hiroshi G. / Statistical models for speech dereverberation. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2009. pp. 145-148
@inproceedings{21fec434ebc1486b9ea3d337df11cb96,
title = "Statistical models for speech dereverberation",
abstract = "This paper discusses a statistical-model-based approach to speech dereverberation. With this approach, we first define parametric statistical models of probability density functions (pdfs) for a clean speech signal and a room transmission channel, then estimate the model parameters, and finally recover the clean speech signal by using the pdfs with the estimated parameter values. The key to the success of this approach lies in the definition of the models of the clean speech signal and room transmission channel pdfs. This paper presents several statistical models (including newly proposed ones) and compares them in a large-scale experiment. As regards the room transmission channel pdf, an autoregressive (AR) model, an autoregressive power spectral density (ARPSD) model, and a moving-average power spectral density (MAPSD) model are considered. A clean speech signal pdf model is selected according to the room transmission channel pdf model. The AR model exhibited the highest dereverberation accuracy when a reverberant speech signal of 2 sec or longer was available, while the other two models outperformed the AR model when only a 1-sec reverberant speech signal was available.",
keywords = "Dereverberation, Statistical model",
author = "Yoshioka, Takuya and Kameoka, Hirokazu and Nakatani, Tomohiro and Okuno, {Hiroshi G.}",
year = "2009",
doi = "10.1109/ASPAA.2009.5346489",
language = "English",
isbn = "9781424436798",
pages = "145--148",
booktitle = "IEEE Workshop on Applications of Signal Processing to Audio and Acoustics",
}

TY - GEN

T1 - Statistical models for speech dereverberation

AU - Yoshioka, Takuya

AU - Kameoka, Hirokazu

AU - Nakatani, Tomohiro

AU - Okuno, Hiroshi G.

PY - 2009

Y1 - 2009

N2 - This paper discusses a statistical-model-based approach to speech dereverberation. With this approach, we first define parametric statistical models of probability density functions (pdfs) for a clean speech signal and a room transmission channel, then estimate the model parameters, and finally recover the clean speech signal by using the pdfs with the estimated parameter values. The key to the success of this approach lies in the definition of the models of the clean speech signal and room transmission channel pdfs. This paper presents several statistical models (including newly proposed ones) and compares them in a large-scale experiment. As regards the room transmission channel pdf, an autoregressive (AR) model, an autoregressive power spectral density (ARPSD) model, and a moving-average power spectral density (MAPSD) model are considered. A clean speech signal pdf model is selected according to the room transmission channel pdf model. The AR model exhibited the highest dereverberation accuracy when a reverberant speech signal of 2 sec or longer was available, while the other two models outperformed the AR model when only a 1-sec reverberant speech signal was available.

AB - This paper discusses a statistical-model-based approach to speech dereverberation. With this approach, we first define parametric statistical models of probability density functions (pdfs) for a clean speech signal and a room transmission channel, then estimate the model parameters, and finally recover the clean speech signal by using the pdfs with the estimated parameter values. The key to the success of this approach lies in the definition of the models of the clean speech signal and room transmission channel pdfs. This paper presents several statistical models (including newly proposed ones) and compares them in a large-scale experiment. As regards the room transmission channel pdf, an autoregressive (AR) model, an autoregressive power spectral density (ARPSD) model, and a moving-average power spectral density (MAPSD) model are considered. A clean speech signal pdf model is selected according to the room transmission channel pdf model. The AR model exhibited the highest dereverberation accuracy when a reverberant speech signal of 2 sec or longer was available, while the other two models outperformed the AR model when only a 1-sec reverberant speech signal was available.

KW - Dereverberation

KW - Statistical model

UR - http://www.scopus.com/inward/record.url?scp=77950107032&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77950107032&partnerID=8YFLogxK

U2 - 10.1109/ASPAA.2009.5346489

DO - 10.1109/ASPAA.2009.5346489

M3 - Conference contribution

SN - 9781424436798

SP - 145

EP - 148

BT - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

ER -