Exploiting harmonic structures to improve separating simultaneous speech in under-determined conditions

Yasuharu Hirasawa*, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output

Abstract

In real-world situations, a robot may often encounter an "under-determined" situation, where there are more sound sources than microphones. This paper presents a speech separation method using a new constraint on the harmonic structure for a simultaneous speech-recognition system in under-determined conditions. The requirements for a speech separation method in a simultaneous speech-recognition system are (1) ability to handle a large number of talkers, and (2) reduction of distortion in acoustic features. Conventional methods use maximum likelihood estimation in sound source separation, which fulfills requirement (1). Since it is a general approach, the performance is limited when separating speech. This paper presents a two-stage method to improve the separation. The first stage uses maximum likelihood estimation and extracts the harmonic structure, and the second stage exploits the harmonic structure as a new constraint to achieve requirement (2). We carried out an experiment that simulated three simultaneous utterances using impulse responses recorded by two microphones in an anechoic chamber. The experimental results revealed that our method could improve speech recognition correctness by about four points.

Original language: English
Title of host publication: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings
Pages: 450-457
Number of pages: 8
Publication status: Published - 2010
Externally published: Yes
Event: 23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Taipei, Taiwan, Province of China
Duration: 2010-10-18 to 2010-10-22

Publication series

Name: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings

Conference

Conference: 23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 2010-10-18 to 2010-10-22

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Control and Systems Engineering
