Computational auditory scene analysis and its application to robot audition

Hiroshi G. Okuno, Kazuhiro Nakadai

Research output: Conference contribution

9 Citations (Scopus)

Abstract

The ability of a robot to hear sounds, in particular a mixture of sounds, with its own microphones (that is, robot audition) is important for improving human-robot interaction. This paper presents open-source robot audition software called "HARK" (HRI-JP Audition for Robots with Kyoto University), which provides the primitive functions of computational auditory scene analysis: sound source localization, sound source separation, and recognition of the separated sounds. Because separation introduces spectral distortion into the separated sounds, HARK generates a time-spectral map of reliability, called a "missing feature mask", for the features of the separated sounds. The separated sounds are then recognized by automatic speech recognition (ASR) based on Missing-Feature Theory (MFT), using these masks. HARK is implemented on middleware called "FlowDesigner", which lets processing modules share intermediate audio data and thereby enables near real-time processing.
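The abstract does not say how the missing feature mask is computed or how the recognizer uses it, so the following is only a minimal sketch of the general idea, not HARK's implementation. It assumes (hypothetically) that each time-frequency cell comes with an energy estimate for the separated source and for the residual interference, that reliability is the target's share of that energy, that the mask is a hard 0/1 threshold on reliability, and that the acoustic model is a diagonal Gaussian whose unreliable dimensions are marginalized out. All function names and the 0.5 threshold are illustrative.

```python
import math


def reliability_map(separated_energy, residual_energy):
    """Per-cell reliability in [0, 1] (assumed definition: target share of energy).

    Both arguments are lists of frames; each frame is a list of
    non-negative energies, one per frequency bin.
    """
    reliability = []
    for sep_frame, res_frame in zip(separated_energy, residual_energy):
        row = []
        for s, r in zip(sep_frame, res_frame):
            total = s + r
            row.append(s / total if total > 0 else 0.0)
        reliability.append(row)
    return reliability


def missing_feature_mask(reliability, threshold=0.5):
    """Hard mask: 1 marks a reliable cell, 0 marks a missing (distorted) one."""
    return [[1 if r >= threshold else 0 for r in frame] for frame in reliability]


def masked_log_likelihood(frame, mask_frame, mean, var):
    """Log-likelihood of one frame under a diagonal Gaussian.

    Dimensions masked as missing are marginalized: integrating a Gaussian
    over all values gives 1, so they add 0 to the log-likelihood.
    """
    total = 0.0
    for x, m, mu, v in zip(frame, mask_frame, mean, var):
        if m:
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - mu) ** 2 / v)
    return total


# Two frames, two frequency bins (made-up energies for illustration).
separated = [[4.0, 1.0], [0.5, 3.0]]
residual = [[1.0, 3.0], [1.5, 1.0]]
mask = missing_feature_mask(reliability_map(separated, residual))

# The second feature is badly distorted but masked, so it does not affect the score.
score = masked_log_likelihood([0.0, 100.0], mask[0], mean=[0.0, 0.0], var=[1.0, 1.0])
```

With a hard mask, a heavily distorted feature cannot drag down the score of the correct model; a soft mask with values between 0 and 1 would be a natural variation on the same scheme.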

Original language: English
Title of host publication: 2008 Hands-free Speech Communication and Microphone Arrays, Proceedings, HSCMA 2008
Pages: 124-127
Number of pages: 4
DOI
Publication status: Published - 2008
Externally published: Yes
Event: 2008 Hands-free Speech Communication and Microphone Arrays, HSCMA 2008 - Trento
Duration: 6 May 2008 to 8 May 2008

Other

Other: 2008 Hands-free Speech Communication and Microphone Arrays, HSCMA 2008
City: Trento
Period: 6 May 2008 to 8 May 2008

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering
  • Communication
