TY - GEN
T1 - Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation
AU - Fujimoto, Masakiyo
AU - Watanabe, Shinji
AU - Nakatani, Tomohiro
PY - 2012
Y1 - 2012
N2 - The estimation of an accurate noise model is a crucial problem for model-based noise suppression including a vector Taylor series (VTS)-based approach. The variation of the speaker characteristics is also a crucial factor as regards the model-based noise suppression. As a result, a speaker adaptation technique plays an important role in the model-based noise suppression. To deal with former problem, we have already proposed an unsupervised estimation method for a noise mixture model. Therefore, this paper proposes a joint processing method that simultaneously achieves speaker adaptation and noise mixture model estimation. This joint processing is realized by using minimum mean squared error (MMSE) estimates of clean speech and noise. Although VTS-based approach involves nonlinear transformation, the MMSE estimates make it possible to flexibly estimate accurate parameters for the joint processing without the influences of non-linear VTS transformation. In the evaluation, the proposed method provided an improvement compared with results obtained using only noise mixture model estimation.
AB - The estimation of an accurate noise model is a crucial problem for model-based noise suppression including a vector Taylor series (VTS)-based approach. The variation of the speaker characteristics is also a crucial factor as regards the model-based noise suppression. As a result, a speaker adaptation technique plays an important role in the model-based noise suppression. To deal with former problem, we have already proposed an unsupervised estimation method for a noise mixture model. Therefore, this paper proposes a joint processing method that simultaneously achieves speaker adaptation and noise mixture model estimation. This joint processing is realized by using minimum mean squared error (MMSE) estimates of clean speech and noise. Although VTS-based approach involves nonlinear transformation, the MMSE estimates make it possible to flexibly estimate accurate parameters for the joint processing without the influences of non-linear VTS transformation. In the evaluation, the proposed method provided an improvement compared with results obtained using only noise mixture model estimation.
KW - MMSE estimation
KW - noise mixture model
KW - noise suppression
KW - speaker adaptation
UR - http://www.scopus.com/inward/record.url?scp=84867606947&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867606947&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2012.6288971
DO - 10.1109/ICASSP.2012.6288971
M3 - Conference contribution
AN - SCOPUS:84867606947
SN - 9781467300469
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4713
EP - 4716
BT - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
T2 - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Y2 - 25 March 2012 through 30 March 2012
ER -