Missing-feature-theory-based robust simultaneous speech recognition system with non-clean speech acoustic model

Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Conference contribution

3 Citations (Scopus)

Abstract

A humanoid robot in the real world must recognize a target speech signal while the people around it are talking. To do so, the robot has to separate the target speech from the other speech signals and then recognize the separated signal. Because the separated signal contains distortion, automatic speech recognition (ASR) performance degrades. To avoid this degradation, we trained an acoustic model on non-clean speech signals so that it adapts to the acoustic features of the distorted signal, and we added white noise to the separated speech signal before extracting acoustic features. Two issues arise: (1) determining the optimal noise level to add to the training speech signals, and (2) determining the optimal noise level to add to the separated signal. In this paper, we investigate how much noise should be added to clean speech data for training, and how speech recognition performance improves for different positions of three talkers with soft masking. Experimental results show that the best performance is obtained by adding white noise of 30 dB. ASR with the resulting acoustic model outperforms ASR with the clean acoustic model by 4 points.
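The noise-addition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the noise level is expressed as a signal-to-noise ratio (SNR) in dB, which the abstract does not state, and it uses a synthetic tone in place of a separated speech signal. The function names `add_white_noise` and `measured_snr_db` are hypothetical.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return `signal` plus Gaussian white noise scaled to a target SNR in dB.

    Assumption: the noise level is interpreted as an SNR. The abstract does
    not define how its "30 dB" figure is measured.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # Choose the scale so that 10*log10(signal_power / scaled_noise_power) == snr_db.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Measure the SNR in dB of `noisy` relative to the known `clean` signal."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Toy stand-in for a separated speech signal: one second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noisy = add_white_noise(clean, snr_db=30.0)
print(round(measured_snr_db(clean, noisy), 2))
```

Under the same assumption, the function could be applied both to the clean training data and to the separated signal at recognition time, with `snr_db` swept over candidate values to compare recognition accuracy.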

Original language: English
Title of host publication: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009
Pages: 2730-2735
Number of pages: 6
DOI
Publication status: Published - Dec 11, 2009
Externally published: Yes
Event: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009 - St. Louis, MO, United States
Duration: Oct 11, 2009 to Oct 15, 2009

Publication series

Name: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009

Conference

Conference: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009
Country/Territory: United States
City: St. Louis, MO
Period: Oct 11, 2009 to Oct 15, 2009

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Control and Systems Engineering
