TY - GEN
T1 - On-line adaptation and Bayesian detection of environmental changes based on a macroscopic time evolution system
AU - Watanabe, Shinji
AU - Nakamura, Atsushi
PY - 2009
Y1 - 2009
AB - Acoustic characteristics are often changed over time as a result of various factors including changes of speakers, speaking styles, and noise sources. Incremental adaptation techniques for speech recognition are aimed at adjusting acoustic models quickly and stably to such time-variant acoustic characteristics. Recently we proposed a novel incremental adaptation framework based on a macroscopic time evolution system, which models the time-variant characteristics by successively updating posterior distributions of acoustic model parameters. This paper proposes fast incremental adaptation based on a macroscopic time evolution system that realizes an utterance-by-utterance update by approximating the posterior distributions. This adaptation was used to perform on-line adaptation of Japanese broadcast news for very large vocabulary continuous speech recognition (700k vocabulary size) in real time. The word accuracy was improved from 73.9% to 85.1%. In addition, by incorporating a Bayesian model selection approach, we realized the simultaneous on-line adaptation and detection of environmental changes.
KW - Acoustic model
KW - Macroscopic time evolution system
KW - Model selection
KW - On-line adaptation
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=70349213985&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349213985&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2009.4960598
DO - 10.1109/ICASSP.2009.4960598
M3 - Conference contribution
AN - SCOPUS:70349213985
SN - 9781424423545
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4373
EP - 4376
BT - 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009
T2 - 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009
Y2 - 19 April 2009 through 24 April 2009
ER -