Recognition of simultaneous speech by estimating reliability of separated signals for robot audition

Shun'ichi Yamamoto*, Ryu Takeda, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean Marc Valin, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output

3 Citations (Scopus)

Abstract

"Listening to several things at once" is a human dream and one goal of AI and robot audition, because psychophysical observations show that people can listen to at most two things at once. Current noise reduction techniques cannot help achieve this goal because they assume quasi-stationary noise, not interfering speech signals. Since robots are used in various environments, robot audition systems should require only minimal a priori information about their acoustic environments and speakers. We evaluate a missing feature theory approach that interfaces between sound source separation (SSS) and automatic speech recognition. The essential part is estimating the reliability of each feature of the separated sounds. We tested two kinds of robot audition systems that use SSS: independent component analysis (ICA) with two microphones, and geometric source separation (GSS) with eight microphones. For each SSS method, we developed automatic missing feature mask generation. Recognition accuracy for two simultaneous speech signals improved to an average of 67.8% and 88.0% for ICA and GSS, respectively.

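The missing feature idea summarized in the abstract can be illustrated with a minimal sketch. It assumes a hard (binary) mask obtained by thresholding a per-feature reliability score and an acoustic model with diagonal Gaussian densities. This record does not describe how the paper generates masks for ICA or GSS, so the names `binary_mask` and `masked_log_likelihood`, the threshold, and all numbers below are hypothetical.

```python
import math

def binary_mask(reliability, threshold=0.5):
    """Hard mask: 1 marks a feature as reliable, 0 as missing (assumed rule)."""
    return [1 if r >= threshold else 0 for r in reliability]

def masked_log_likelihood(features, mask, means, variances):
    """Diagonal-Gaussian log-likelihood over reliable features only.

    Masked-out features are marginalized. For a diagonal Gaussian this
    amounts to skipping their terms in the sum.
    """
    total = 0.0
    for x, m, mu, var in zip(features, mask, means, variances):
        if m:
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total

# Hypothetical 4-dimensional feature vector; the third feature is corrupted
# by leakage from the other talker.
features = [0.1, -0.2, 5.0, 0.3]
reliability = [0.9, 0.8, 0.1, 0.7]  # hypothetical reliability estimates
means = [0.0, 0.0, 0.0, 0.0]
variances = [1.0, 1.0, 1.0, 1.0]

mask = binary_mask(reliability)
full = masked_log_likelihood(features, [1, 1, 1, 1], means, variances)
masked = masked_log_likelihood(features, mask, means, variances)
```

In this toy case the corrupted third feature is excluded from the score, so `masked` is higher than `full`: the unreliable feature no longer penalizes the correct model.
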
Original language: English
Title of host publication: PRICAI 2006
Subtitle of host publication: Trends in Artificial Intelligence - 9th Pacific Rim International Conference on Artificial Intelligence, Proceedings
Publisher: Springer Verlag
Pages: 484-494
Number of pages: 11
ISBN (Print): 3540366679, 9783540366676
Publication status: Published - 2006
Externally published: Yes
Event: 9th Pacific Rim International Conference on Artificial Intelligence - Guilin, China
Duration: 7 Aug 2006 to 11 Aug 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4099 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th Pacific Rim International Conference on Artificial Intelligence
Country/Territory: China
City: Guilin
Period: 06/8/7 to 06/8/11

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
