Generalized weighted-prediction-error dereverberation with varying source priors for reverberant speech recognition

Toru Taniguchi, Aswin Shanmugam Subramanian, Xiaofei Wang, Dung Tran, Yuya Fujita, Shinji Watanabe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Weighted-prediction-error (WPE) is a well-known dereverberation method, used in particular to reduce the degradation of automatic speech recognition (ASR) performance in distant-speaker scenarios. WPE usually assumes that the desired source signals follow a predefined source prior, such as a Gaussian with time-varying variance (TVG). Although WPE works well in practice under this assumption, the appropriate prior generally depends on the source and cannot be known before processing. On-demand estimation of the source prior, e.g. for each utterance, is therefore required. For this purpose, we extend WPE by introducing a complex-valued generalized Gaussian (CGG) prior, together with an estimator of its shape parameter inside the processing, so that a variety of super-Gaussian sources can be handled. Blind estimation of the shape parameter is realized by adding a shape parameter estimator as a sub-network to WPE-CGG, which is itself treated as a differentiable neural network. The sub-network can be trained by backpropagation from the outputs of the whole network using any criterion, such as signal-level mean square error, or even ASR errors if the WPE-CGG computational graph is connected to that of the ASR network. Experimental results show that the proposed method outperforms conventional baseline methods with the TVG prior, without requiring careful setting of the shape parameter value during evaluation.
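
The role of the shape parameter can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a single frequency bin, a single channel, and a generic iteratively reweighted least-squares form in which an l_p-type prior with exponent `shape` yields the frame weight (|d_t|^2 + eps)^(shape/2 - 1). With `shape` near 0 this approaches the usual TVG-style weight 1/|d_t|^2, and with `shape` equal to 2 it becomes ordinary least squares. The function names (`solve_linear`, `wpe_cgg_bin`, `demo`), the toy autoregressive reverberation, and the hand-picked `shape` value are all assumptions for illustration; the abstract's method estimates the shape parameter with a trained sub-network instead.

```python
import random

def solve_linear(A, b):
    """Solve the small complex system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / M[r][r]
    return x

def wpe_cgg_bin(y, shape, taps=2, delay=2, iters=5, eps=1e-3):
    """Dereverberate one frequency bin y (list of complex frames); hypothetical helper."""
    T = len(y)
    # Delayed past observations used to predict the late reverberation of each frame.
    past = [[y[t - delay - k] if t - delay - k >= 0 else 0j for k in range(taps)]
            for t in range(T)]
    d = list(y)  # initial estimate of the desired signal
    for _ in range(iters):
        # Assumed IRLS weight for an l_p-type prior with exponent `shape`.
        w = [(abs(dt) ** 2 + eps) ** (shape / 2.0 - 1.0) for dt in d]
        # Weighted correlation matrix R and vector P of the normal equations R g = P.
        R = [[sum(w[t] * past[t][i] * past[t][j].conjugate() for t in range(T))
              for j in range(taps)] for i in range(taps)]
        P = [sum(w[t] * past[t][i] * y[t].conjugate() for t in range(T))
             for i in range(taps)]
        g = solve_linear(R, P)
        # Subtract the predicted late reverberation: d_t = y_t - g^H past_t.
        d = [y[t] - sum(g[k].conjugate() * past[t][k] for k in range(taps))
             for t in range(T)]
    return d

def demo(shape, seed=0, T=400):
    """Toy check: return squared error to the source before and after processing."""
    rng = random.Random(seed)
    # Super-Gaussian toy source: Gaussian samples with a random, mostly small amplitude.
    s = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) * rng.random() ** 3
         for _ in range(T)]
    y = []
    for t in range(T):
        # Toy reverberation: the observation feeds back with delays of 2 and 3 frames.
        tail = 0.5 * (y[t - 2] if t >= 2 else 0) + 0.3 * (y[t - 3] if t >= 3 else 0)
        y.append(s[t] + tail)
    d = wpe_cgg_bin(y, shape)
    err_in = sum(abs(a - b) ** 2 for a, b in zip(y, s))
    err_out = sum(abs(a - b) ** 2 for a, b in zip(d, s))
    return err_in, err_out
```

Calling `demo(0.5)` and `demo(2.0)` compares a super-Gaussian setting with plain least squares on the same toy data. In a real system the filter would be estimated per frequency bin on STFT frames, and the best `shape` would depend on the source, which is the motivation the abstract gives for estimating it per utterance.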

Original language: English
Title of host publication: 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 293-297
Number of pages: 5
ISBN (Electronic): 9781728111230
DOIs: https://doi.org/10.1109/WASPAA.2019.8937270
Publication status: Published - 2019 Oct
Externally published: Yes
Event: 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019 - New Paltz, United States
Duration: 2019 Oct 20 → 2019 Oct 23

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Volume: 2019-October
ISSN (Print): 1931-1168
ISSN (Electronic): 1947-1629

Conference

Conference: 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019
Country: United States
City: New Paltz
Period: 19/10/20 → 19/10/23

Keywords

  • complex generalized Gaussian
  • reverberant speech recognition
  • shape parameter
  • Single-channel Dereverberation
  • WPE

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications

  • Cite this

    Taniguchi, T., Subramanian, A. S., Wang, X., Tran, D., Fujita, Y., & Watanabe, S. (2019). Generalized weighted-prediction-error dereverberation with varying source priors for reverberant speech recognition. In 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019 (pp. 293-297). [8937270] (IEEE Workshop on Applications of Signal Processing to Audio and Acoustics; Vol. 2019-October). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/WASPAA.2019.8937270