Abstract
The ability of a robot to listen to several things at once with its own ears, that is, robot audition, is important for improving human-robot interaction. The critical issue in robot audition is real-time processing in noisy environments, with enough flexibility to support various kinds of robots and hardware configurations. This paper presents open-source robot audition software called "HARK", which includes sound source localization, sound source separation, and automatic speech recognition (ASR). Because separation introduces spectral distortion into the separated sounds, HARK generates a time-frequency map of reliability, called a "missing feature mask", for the features of the separated sounds. The separated sounds are then recognized by ASR based on Missing-Feature Theory (MFT), using the missing feature masks. HARK is implemented on middleware called "FlowDesigner" to share intermediate audio data, which enables real-time processing. HARK's performance in recognizing noisy and simultaneous speech is demonstrated on three humanoid robots with different microphone layouts: Honda ASIMO, SIG2, and Robovie.
Original language | English |
---|---|
Title of host publication | 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008 |
Pages | 561-566 |
Number of pages | 6 |
DOI | |
Publication status | Published - 2008 |
Externally published | Yes |
Event | 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008 - Daejeon Duration: 1 Dec 2008 → 3 Dec 2008 |
Other
Other | 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008 |
---|---|
City | Daejeon |
Period | 1 Dec 2008 → 3 Dec 2008 |
ASJC Scopus subject areas
- Artificial Intelligence
- Computer Vision and Pattern Recognition
- Human-Computer Interaction