MFCC enhancement using joint corrupted and noise feature space for highly non-stationary noise environments

Masayuki Suzuki, Takuya Yoshioka, Shinji Watanabe, Nobuaki Minematsu, Keikichi Hirose

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

One of the most effective approaches to noise-robust speech recognition is to remove the effect of noise directly from corrupted MFCC vectors. However, VTS enhancement, a typical MFCC enhancement method, provides limited improvement when the noise is highly non-stationary. This is because, to keep the computational cost at an acceptable level, VTS enhancement cannot use a time-varying noise model. This paper proposes a method that enhances MFCC vectors and their dynamic parameters using noise estimates that change on a frame-by-frame basis, at a practical computational cost. The proposed method employs stereo data-based feature mapping, like the well-known SPLICE algorithm. Its novelty lies in its use of the joint space spanned by a concatenated vector of corrupted and noise features. Linear discriminant analysis is also proposed to reduce the dimensionality of the joint space effectively. The proposed method achieves relative error reductions of 19.1% and 8.3% over the SPLICE and noise-mean normalized SPLICE algorithms, respectively.
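The mapping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a SPLICE-style correction in which each enhanced frame is the corrupted frame plus a posterior-weighted sum of per-cluster bias vectors, with the posteriors computed on the concatenated [corrupted; noise] vector. The equal-weight, shared-variance Gaussian clusters, the k-means fit, the toy additive data, and all function names are assumptions made for illustration. The linear discriminant analysis step is omitted.

```python
import numpy as np


def posteriors(z, means, var=1.0):
    """Cluster responsibilities under equal-weight Gaussians that share one
    spherical variance (an assumption; a full GMM would be used in practice)."""
    d2 = ((z[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    logp = -0.5 * d2 / var
    logp -= logp.max(axis=1, keepdims=True)  # subtract row max for stability
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)


def fit_means(z, k, iters=10):
    """Plain k-means on the joint vectors with a deterministic, spread-out
    initialization (frames picked along the sorted coordinate sums)."""
    order = np.argsort(z.sum(axis=1))
    means = z[order[np.linspace(0, len(z) - 1, k).astype(int)]].copy()
    for _ in range(iters):
        d2 = ((z[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's mean untouched
                means[j] = z[labels == j].mean(axis=0)
    return means


def train_biases(clean, corrupted, post):
    """Per-cluster bias: posterior-weighted mean of (clean - corrupted),
    estimated from stereo (parallel clean/corrupted) training frames."""
    num = post.T @ (clean - corrupted)
    den = post.sum(axis=0)[:, None]
    return num / np.maximum(den, 1e-12)


def enhance(corrupted, noise, means, biases, var=1.0):
    """Add the posterior-weighted bias to each corrupted frame; posteriors
    are computed in the joint [corrupted; noise] space."""
    z = np.hstack([corrupted, noise])
    return corrupted + posteriors(z, means, var) @ biases


def toy_stereo_data(n_frames=400, dim=2, seed=0):
    """Synthetic stand-in for stereo data: the noise level alternates frame
    by frame and is simply added in the feature domain (a toy assumption,
    not the real MFCC mismatch function)."""
    rng = np.random.default_rng(seed)
    clean = rng.normal(size=(n_frames, dim))
    level = np.where(np.arange(n_frames) % 2 == 0, 2.0, 6.0)[:, None]
    noise = level + 0.1 * rng.normal(size=(n_frames, dim))
    return clean, clean + noise, noise


if __name__ == "__main__":
    clean, corrupted, noise = toy_stereo_data()
    z = np.hstack([corrupted, noise])
    means = fit_means(z, k=2)
    biases = train_biases(clean, corrupted, posteriors(z, means))
    enhanced = enhance(corrupted, noise, means, biases)
    print("MSE before:", np.mean((corrupted - clean) ** 2))
    print("MSE after: ", np.mean((enhanced - clean) ** 2))
```

Because the frame-level noise estimate is part of the vector that selects the cluster, the applied bias can change from frame to frame without re-estimating a noise model, which is the property the abstract emphasizes. The toy data only exercises the mechanics and says nothing about recognition accuracy.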

Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 4109-4112
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.2012.6288822
Publication status: Published - 2012 Oct 23
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: 2012 Mar 25 to 2012 Mar 30

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Country: Japan
City: Kyoto
Period: 12/3/25 to 12/3/30

Keywords

  • Noise robust ASR
  • SPLICE
  • non-stationary noise

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Suzuki, M., Yoshioka, T., Watanabe, S., Minematsu, N., & Hirose, K. (2012). MFCC enhancement using joint corrupted and noise feature space for highly non-stationary noise environments. In 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings (pp. 4109-4112). [6288822] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2012.6288822