Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation

Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

The estimation of an accurate noise model is a crucial problem for model-based noise suppression, including vector Taylor series (VTS)-based approaches. The variation of speaker characteristics is also a crucial factor in model-based noise suppression. As a result, a speaker adaptation technique plays an important role in model-based noise suppression. To deal with the former problem, we have already proposed an unsupervised estimation method for a noise mixture model. Therefore, this paper proposes a joint processing method that simultaneously achieves speaker adaptation and noise mixture model estimation. This joint processing is realized by using minimum mean squared error (MMSE) estimates of clean speech and noise. Although the VTS-based approach involves a nonlinear transformation, the MMSE estimates make it possible to flexibly estimate accurate parameters for the joint processing without the influence of the nonlinear VTS transformation. In the evaluation, the proposed method provided an improvement compared with results obtained using only noise mixture model estimation.
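The abstract does not spell out the nonlinear transformation it refers to. As a rough illustration only, the sketch below uses the log-spectral mismatch function commonly assumed in the VTS literature, y = log(exp(x) + exp(n)), together with its first-order Taylor expansion. The function names (`mismatch`, `vts_linearize`, `vts_approx`) and the scalar, single-bin setting are assumptions made for this example; they are not taken from the paper and do not reproduce the authors' joint estimation method.

```python
import math


def mismatch(x, n):
    """Noisy log-spectrum y = log(exp(x) + exp(n)) for clean speech x and noise n.

    Assumes additive noise in the power domain with the phase term ignored.
    """
    m = max(x, n)  # factor out the larger term for numerical stability
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def vts_linearize(x0, n0):
    """Value and partial derivatives of the mismatch function at (x0, n0)."""
    y0 = mismatch(x0, n0)
    g = 1.0 / (1.0 + math.exp(n0 - x0))  # dy/dx at the expansion point
    f = 1.0 - g                          # dy/dn at the expansion point
    return y0, g, f


def vts_approx(x, n, x0, n0):
    """First-order Taylor approximation of mismatch(x, n) around (x0, n0)."""
    y0, g, f = vts_linearize(x0, n0)
    return y0 + g * (x - x0) + f * (n - n0)


if __name__ == "__main__":
    x0, n0 = 2.0, 0.5  # hypothetical expansion point (e.g. model means)
    for x, n in [(2.0, 0.5), (2.1, 0.4), (3.0, -0.5)]:
        exact = mismatch(x, n)
        approx = vts_approx(x, n, x0, n0)
        print(f"x={x:4.1f} n={n:4.1f} exact={exact:.4f} approx={approx:.4f}")
```

The approximation is exact at the expansion point and degrades as the inputs move away from it. That is the sense in which a linearized VTS transformation can distort parameter estimates, and it is the influence the abstract says the MMSE estimates avoid.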

Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 4713-4716
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.2012.6288971
Publication status: Published - 2012
Externally published: Yes
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto
Duration: 2012 Mar 25 → 2012 Mar 30

Other

Other: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
City: Kyoto
Period: 12/3/25 → 12/3/30

Fingerprint

Taylor series
Processing

Keywords

  • MMSE estimation
  • noise mixture model
  • noise suppression
  • speaker adaptation

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Fujimoto, M., Watanabe, S., & Nakatani, T. (2012). Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation. In 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings (pp. 4713-4716). [6288971] https://doi.org/10.1109/ICASSP.2012.6288971

Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation. / Fujimoto, Masakiyo; Watanabe, Shinji; Nakatani, Tomohiro.

2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings. 2012. p. 4713-4716 6288971.

Fujimoto, M, Watanabe, S & Nakatani, T 2012, Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation. in 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings., 6288971, pp. 4713-4716, 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012, Kyoto, 12/3/25. https://doi.org/10.1109/ICASSP.2012.6288971

Fujimoto M, Watanabe S, Nakatani T. Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation. In 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings. 2012. p. 4713-4716. 6288971 https://doi.org/10.1109/ICASSP.2012.6288971

Fujimoto, Masakiyo ; Watanabe, Shinji ; Nakatani, Tomohiro. / Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation. 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings. 2012. pp. 4713-4716

@inproceedings{aabff5613842498bae0e9e626cac594d,
title = "Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation",
abstract = "The estimation of an accurate noise model is a crucial problem for model-based noise suppression, including vector Taylor series (VTS)-based approaches. The variation of speaker characteristics is also a crucial factor in model-based noise suppression. As a result, a speaker adaptation technique plays an important role in model-based noise suppression. To deal with the former problem, we have already proposed an unsupervised estimation method for a noise mixture model. Therefore, this paper proposes a joint processing method that simultaneously achieves speaker adaptation and noise mixture model estimation. This joint processing is realized by using minimum mean squared error (MMSE) estimates of clean speech and noise. Although the VTS-based approach involves a nonlinear transformation, the MMSE estimates make it possible to flexibly estimate accurate parameters for the joint processing without the influence of the nonlinear VTS transformation. In the evaluation, the proposed method provided an improvement compared with results obtained using only noise mixture model estimation.",
keywords = "MMSE estimation, noise mixture model, noise suppression, speaker adaptation",
author = "Masakiyo Fujimoto and Shinji Watanabe and Tomohiro Nakatani",
year = "2012",
doi = "10.1109/ICASSP.2012.6288971",
language = "English",
isbn = "9781467300469",
pages = "4713--4716",
booktitle = "2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings",

}

TY - GEN

T1 - Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation

AU - Fujimoto, Masakiyo

AU - Watanabe, Shinji

AU - Nakatani, Tomohiro

PY - 2012

Y1 - 2012

AB - The estimation of an accurate noise model is a crucial problem for model-based noise suppression, including vector Taylor series (VTS)-based approaches. The variation of speaker characteristics is also a crucial factor in model-based noise suppression. As a result, a speaker adaptation technique plays an important role in model-based noise suppression. To deal with the former problem, we have already proposed an unsupervised estimation method for a noise mixture model. Therefore, this paper proposes a joint processing method that simultaneously achieves speaker adaptation and noise mixture model estimation. This joint processing is realized by using minimum mean squared error (MMSE) estimates of clean speech and noise. Although the VTS-based approach involves a nonlinear transformation, the MMSE estimates make it possible to flexibly estimate accurate parameters for the joint processing without the influence of the nonlinear VTS transformation. In the evaluation, the proposed method provided an improvement compared with results obtained using only noise mixture model estimation.

KW - MMSE estimation

KW - noise mixture model

KW - noise suppression

KW - speaker adaptation

UR - http://www.scopus.com/inward/record.url?scp=84867606947&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867606947&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2012.6288971

DO - 10.1109/ICASSP.2012.6288971

M3 - Conference contribution

SN - 9781467300469

SP - 4713

EP - 4716

BT - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings

ER -