Missing feature speech recognition in a meeting situation with maximum SNR beamforming

Dorothea Kolossa*, Shoko Araki, Marc Delcroix, Tomohiro Nakatani, Reinhold Orglmeister, Shoji Makino

*Corresponding author for this work

Research output: Conference contribution

7 citations (Scopus)

Abstract

Especially for tasks like automatic meeting transcription, it would be useful to recognize speech automatically even while multiple speakers are talking simultaneously. For this purpose, speech separation can be performed, for example by maximum SNR beamforming. However, even when good interferer suppression is attained, the interfering speech remains recognizable during intervals where the target speaker is silent. To avoid the resulting insertion errors, a new soft masking scheme is proposed, which works in the time domain by strongly damping those periods in which the observed direction of arrival does not correspond to that of the target speaker. Even though the masking scheme is aggressive, missing feature recognition allows the recognition accuracy to be improved significantly, with relative error reductions on the order of 60% compared to maximum SNR beamforming alone, and the approach also succeeds for three simultaneously active speakers. Results are reported for the SOLON speech recognizer, NTT's large vocabulary system [1], applied here to artificially mixed data generated from real-room impulse responses and the entire clean test set of the Aurora 2 database.
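The two signal-processing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the mask formula, so the sigmoid shape and the parameters `tolerance_deg`, `floor`, and `slope` are hypothetical, as are both function names. The beamformer part uses the standard formulation of maximum SNR beamforming as a generalized eigenvalue problem for one frequency bin, assuming target and interference covariance matrices are already available.

```python
import numpy as np


def max_snr_weights(R_target, R_interf):
    """Return unit-norm weights w maximizing (w^H R_target w) / (w^H R_interf w).

    Solves the generalized eigenproblem R_target w = lambda R_interf w by
    whitening with the Cholesky factor of R_interf (assumed positive definite).
    """
    L = np.linalg.cholesky(R_interf)        # R_interf = L L^H
    L_inv = np.linalg.inv(L)
    M = L_inv @ R_target @ L_inv.conj().T   # whitened target covariance
    M = 0.5 * (M + M.conj().T)              # remove numerical asymmetry
    _, eigvecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    w = L_inv.conj().T @ eigvecs[:, -1]     # map principal eigenvector back
    return w / np.linalg.norm(w)


def soft_time_mask(doa_deg, target_deg, tolerance_deg=15.0, floor=0.05, slope=0.5):
    """Per-frame gain in [floor, 1] (hypothetical sigmoid form).

    Frames whose observed direction of arrival is far from the target
    direction are strongly damped; frames close to it pass almost unchanged.
    """
    doa = np.asarray(doa_deg, dtype=float)
    # Angular deviation wrapped into [0, 180] degrees.
    deviation = np.abs((doa - target_deg + 180.0) % 360.0 - 180.0)
    gain = 1.0 / (1.0 + np.exp(slope * (deviation - tolerance_deg)))
    return floor + (1.0 - floor) * gain
```

In such a setup the mask would be multiplied frame by frame onto the beamformer output, and a missing feature recognizer would then treat the heavily damped frames as unreliable rather than as clean observations.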

Original language: English
Title of host publication: 2008 IEEE International Symposium on Circuits and Systems, ISCAS 2008
Pages: 3218-3221
Number of pages: 4
Publication status: Published - 2008
Externally published: Yes
Event: 2008 IEEE International Symposium on Circuits and Systems, ISCAS 2008 - Seattle, WA, United States
Duration: 18 May 2008 to 21 May 2008

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
ISSN (Print): 0271-4310

Conference

Conference: 2008 IEEE International Symposium on Circuits and Systems, ISCAS 2008
Country/Territory: United States
City: Seattle, WA
Period: 2008/5/18 to 2008/5/21

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
