Non-stationary noise estimation method based on bias-residual component decomposition for robust speech recognition

Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

This paper addresses a noise suppression problem, namely the estimation of non-stationary noise sequences. We assume that non-stationary noise can be decomposed into a stationary component and a non-stationary component: a bias factor, and a residual signal defined as the difference between the noise at each frame and the bias component. This decomposition clarifies the role of each component and lets us apply a suitable parameter estimation technique to each. The bias component is estimated by the EM algorithm using the entire observed signal sequence, whereas the residual component is estimated sequentially by combining the extended Kalman filter with the EM algorithm. In the evaluation, the proposed method improved speech recognition accuracy compared with noise estimation methods that do not use component decomposition.
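The split between a batch-estimated bias and a sequentially tracked residual can be illustrated with a deliberately simplified sketch. It is not the authors' method: it assumes the noise feature is observed directly as one scalar per frame, substitutes a sample mean for the EM-based bias estimate, and substitutes a linear scalar Kalman filter with a random-walk state model for the extended Kalman filter with EM. The function names and the variances `q` and `r` are hypothetical, chosen only for illustration.

```python
def estimate_bias(frames):
    """Batch step: estimate the stationary (bias) component from the whole
    sequence. Assumption: a plain sample mean stands in for the EM estimate."""
    return sum(frames) / len(frames)


def track_residual(frames, bias, q=0.01, r=0.25):
    """Sequential step: track the residual (frame value minus bias) with a
    scalar linear Kalman filter.

    Assumed state model:       res_t = res_{t-1} + w_t,  w_t ~ N(0, q)
    Assumed observation model: y_t - bias = res_t + v_t, v_t ~ N(0, r)
    """
    est, var = 0.0, 1.0          # initial residual estimate and its variance
    residuals = []
    for y in frames:
        var_pred = var + q                        # predict: uncertainty grows
        gain = var_pred / (var_pred + r)          # Kalman gain
        est = est + gain * ((y - bias) - est)     # correct with the innovation
        var = (1.0 - gain) * var_pred             # posterior variance
        residuals.append(est)
    return residuals


def decompose(frames, q=0.01, r=0.25):
    """Return (bias, residual sequence, reconstructed noise sequence)."""
    bias = estimate_bias(frames)
    residuals = track_residual(frames, bias, q, r)
    noise = [bias + res for res in residuals]
    return bias, residuals, noise


if __name__ == "__main__":
    # Toy noise level that steps from 0 to 2 halfway through the sequence.
    frames = [0.0] * 50 + [2.0] * 50
    bias, residuals, noise = decompose(frames)
    print("bias:", bias)
    print("last residual:", residuals[-1])
    print("last noise estimate:", noise[-1])
```

In this toy setting the bias captures the average level over the whole sequence, while the filter follows the frame-by-frame deviation from it. A smaller `q` makes the residual track more slowly, and a larger `q` makes it follow each frame more closely.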

Original language: English
Title of host publication: 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings
Pages: 4816-4819
Number of pages: 4
DOI: https://doi.org/10.1109/ICASSP.2011.5947433
Publication status: Published - 2011
Externally published: Yes
Event: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Prague
Duration: 2011 May 22 to 2011 May 27

Other

Other: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
City: Prague
Period: 2011 May 22 to 2011 May 27

Fingerprint

Speech recognition
Acoustic noise
Decomposition
Extended Kalman filters
Parameter estimation

Keywords

  • component decomposition
  • noise suppression
  • nonstationary noise
  • speech recognition

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Fujimoto, M., Watanabe, S., & Nakatani, T. (2011). Non-stationary noise estimation method based on bias-residual component decomposition for robust speech recognition. In 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings (pp. 4816-4819). [5947433] https://doi.org/10.1109/ICASSP.2011.5947433

@inproceedings{463ad0aa57a348308bf777aad76f5e96,
title = "Non-stationary noise estimation method based on bias-residual component decomposition for robust speech recognition",
keywords = "component decomposition, noise suppression, nonstationary noise, speech recognition",
author = "Masakiyo Fujimoto and Shinji Watanabe and Tomohiro Nakatani",
year = "2011",
doi = "10.1109/ICASSP.2011.5947433",
language = "English",
isbn = "9781457705397",
pages = "4816--4819",
booktitle = "2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings",

}

TY - GEN

T1 - Non-stationary noise estimation method based on bias-residual component decomposition for robust speech recognition

AU - Fujimoto, Masakiyo

AU - Watanabe, Shinji

AU - Nakatani, Tomohiro

PY - 2011

Y1 - 2011

KW - component decomposition

KW - noise suppression

KW - nonstationary noise

KW - speech recognition

UR - http://www.scopus.com/inward/record.url?scp=80051616431&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80051616431&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2011.5947433

DO - 10.1109/ICASSP.2011.5947433

M3 - Conference contribution

AN - SCOPUS:80051616431

SN - 9781457705397

SP - 4816

EP - 4819

BT - 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings

ER -