An open source software system for robot audition HARK and its evaluation

Kazuhiro Nakadai*, Hiroshi G. Okuno, Hirofumi Nakajima, Yuji Hasegawa, Hiroshi Tsujino

*Corresponding author for this work

Research output: Conference contribution

53 Citations (Scopus)

Abstract

Robot capability of listening to several things at once by its own ears, that is, robot audition, is important in improving human-robot interaction. The critical issue in robot audition is real-time processing in noisy environments with high flexibility to support various kinds of robots and hardware configurations. This paper presents open-source robot audition software, called "HARK", which includes sound source localization, separation, and automatic speech recognition (ASR). Since separated sounds suffer from spectral distortion due to separation, HARK generates a temporal-frequency map of reliability, called "missing feature mask", for features of separated sounds. Then separated sounds are recognized by the Missing-Feature Theory (MFT) based ASR with missing feature masks. HARK is implemented on the middleware called "FlowDesigner" to share intermediate audio data, which provides real-time processing. HARK's performance in recognition of noisy/simultaneous speech is shown by using three humanoid robots, Honda ASIMO, SIG2 and Robovie with different microphone layouts.
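The abstract's idea of a time-frequency reliability map feeding a missing-feature recognizer can be illustrated with a minimal sketch. This is not HARK's implementation. The function names, the reliability ratio, the 0.5 threshold, and the toy energy values are all assumptions made for illustration, and the scoring step uses a common simplification of missing-feature theory (unreliable features are simply skipped in a diagonal-Gaussian log-likelihood).

```python
import math

def missing_feature_mask(signal_energy, noise_energy, threshold=0.5):
    """Return a hard 0/1 reliability mask with the same time-frequency shape.

    Hypothetical rule: reliability = signal / (signal + noise); cells at or
    above `threshold` are marked reliable (1), the rest unreliable (0).
    """
    mask = []
    for sig_frame, noise_frame in zip(signal_energy, noise_energy):
        row = []
        for s, n in zip(sig_frame, noise_frame):
            total = s + n
            reliability = s / total if total > 0 else 0.0
            row.append(1 if reliability >= threshold else 0)
        mask.append(row)
    return mask

def masked_log_likelihood(frame, mask_row, means, variances):
    """Diagonal-Gaussian log-likelihood that skips unreliable features."""
    score = 0.0
    for x, m, mu, var in zip(frame, mask_row, means, variances):
        if m:  # only reliable features contribute to the score
            score += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return score

# Toy data (made up): 2 time frames x 3 frequency bins of estimated
# separated-signal energy and estimated residual-noise energy.
signal = [[4.0, 1.0, 0.2], [3.0, 0.5, 2.0]]
noise = [[1.0, 1.0, 1.8], [0.5, 1.5, 0.5]]
mask = missing_feature_mask(signal, noise)
print(mask)
```

In this sketch a badly distorted feature, however far it lies from the model mean, does not affect the score once its mask entry is 0, which is the intuition behind recognizing separated speech with missing feature masks.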

Original language: English
Title of host publication: 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008
Pages: 561-566
Number of pages: 6
DOI
Publication status: Published - 2008
Externally published: Yes
Event: 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008 - Daejeon
Duration: 1 Dec 2008 to 3 Dec 2008

Other

Other: 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008
City: Daejeon
Period: 08/12/1 to 08/12/3

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
