Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears

Ryu Takeda*, Shun'ichi Yamamoto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Conference contribution

11 Citations (Scopus)

Abstract

Robot audition is a critical technology for enabling robots to live in symbiosis with people. Since we hear a mixture of sounds in our daily lives, sound source localization, sound source separation, and recognition of separated sounds are three essential capabilities. Sound source localization has recently been well studied for robots, while the other capabilities still need extensive study. This paper reports a robot audition system that uses a pair of omni-directional microphones embedded in a humanoid to recognize two simultaneous talkers. It first separates sound sources by Independent Component Analysis (ICA) with a single-input multiple-output (SIMO) model. Then, the spectral distortion of the separated sounds is estimated to identify reliable and unreliable components of the spectrogram. This estimation generates the missing-feature masks as spectrographic masks. These masks are then used to avoid the influence of spectral distortion in automatic speech recognition based on the missing-feature method. The novel idea of our system lies in estimating spectral distortion in the time-frequency domain in terms of feature vectors. In addition, we point out that voice-activity detection (VAD) is effective in overcoming the weakness of ICA when the number of talkers changes. The resulting system outperformed the baseline robot audition system by 15%.
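The mask-generation step described in the abstract (labelling spectrogram components as reliable or unreliable from an estimate of spectral distortion) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the decibel ratio test, and the `threshold_db` value are all assumptions made for illustration, and the abstract does not say how the distortion estimate is obtained or thresholded. The second function shows one simple way a mask might be used, by ignoring unreliable components when comparing a feature vector to a template.

```python
import numpy as np


def missing_feature_mask(separated, distortion, threshold_db=-6.0):
    """Return a binary spectrographic mask (1 = reliable, 0 = unreliable).

    separated, distortion: non-negative power spectrograms of equal shape
    (frequency bins x frames). threshold_db is a hypothetical tuning
    parameter, not a value taken from the paper.
    """
    eps = 1e-12  # avoids log of zero and division by zero
    ratio_db = 10.0 * np.log10((distortion + eps) / (separated + eps))
    # A component counts as reliable when the estimated distortion is
    # sufficiently far below the separated signal's power.
    return (ratio_db < threshold_db).astype(np.uint8)


def masked_squared_distance(feature, template, mask):
    """Squared distance computed over reliable components only."""
    reliable = mask.astype(bool)
    diff = feature[reliable] - template[reliable]
    return float(np.sum(diff ** 2))
```

In a full missing-feature recognizer the mask would typically enter the acoustic likelihood computation rather than a plain distance; the distance above only illustrates the idea of excluding unreliable components.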

Original language: English
Title of host publication: 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2006
Pages: 878-885
Number of pages: 8
Publication status: Published - 2006 Dec 1
Externally published: Yes
Event: 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2006 - Beijing, China
Duration: 2006 Oct 9 to 2006 Oct 15

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems

Conference

Conference: 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2006
Country/Territory: China
City: Beijing
Period: 2006/10/9 to 2006/10/15

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
