Assessment of general applicability of robot audition system by recognizing three simultaneous speeches

Shun'ichi Yamamoto, Kazuhiro Nakadai, Hiroshi Tsujino, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

19 Citations (Scopus)

Abstract

Robot audition is a critical technology for building intelligent robots that operate in daily environments. We have developed a robot audition system that uses a new interface between sound source separation and automatic speech recognition (ASR). A mixture of speeches captured with a pair of microphones installed at the ear positions of a humanoid is separated into individual speeches by an active direction-pass filter (ADPF). The ADPF extracts a sound source originating from a specific direction in real time by using interaural phase and intensity differences. Each separated speech is recognized by a speech recognizer based on missing feature theory (MFT). By using a missing feature mask, the MFT-based ASR ignores distorted and missing features caused by speech separation. A missing feature mask for each separated speech is generated during speech separation and is sent to the ASR together with the separated speech. This integration thus improves the performance of ASR. However, the generality of this robot audition system has not been assessed so far. In this paper, we assess its general applicability by implementing it on three humanoids, i.e., ASIMO of Honda, SIG2, and Replie of Kyoto University. With three simultaneous speeches as the benchmark, the robot audition system improved the performance of ASR over 50% in every humanoid, and thus its general applicability was confirmed.
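The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the interaural phase difference (the ADPF also uses intensity differences), it assumes the target direction is given as a hypothetical inter-microphone delay `target_delay_s`, and it assumes a simple marginalization scheme in which unreliable features are dropped from a diagonal-Gaussian likelihood. The function names, the tolerance `tol_rad`, and the one-feature-per-frequency-bin layout are all assumptions made for illustration.

```python
import cmath
import math


def direction_pass(left, right, freqs_hz, target_delay_s, tol_rad=0.3):
    """Keep frequency bins whose interaural phase difference (IPD)
    matches the IPD expected for a source at the target direction.

    left, right: complex spectra of one frame from the two microphones.
    Returns the filtered left spectrum and a binary reliability mask.
    """
    kept, mask = [], []
    for l, r, f in zip(left, right, freqs_hz):
        observed = cmath.phase(l * r.conjugate())
        expected = 2.0 * math.pi * f * target_delay_s
        # Wrap the phase error into [-pi, pi] before comparing.
        err = cmath.phase(cmath.exp(1j * (observed - expected)))
        reliable = abs(err) <= tol_rad
        mask.append(1 if reliable else 0)
        kept.append(l if reliable else 0j)
    return kept, mask


def masked_log_likelihood(features, mask, means, variances):
    """Diagonal-Gaussian log-likelihood that skips features marked
    unreliable (mask value 0), as a stand-in for MFT-based scoring."""
    total = 0.0
    for x, m, mu, var in zip(features, mask, means, variances):
        if m:
            total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total
```

In this sketch the mask produced during separation is passed unchanged to the scoring function, which mirrors the interface idea in the abstract: the recognizer receives both the separated signal and a per-feature indication of which parts to trust.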

Original language: English
Title of host publication: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Pages: 2111-2116
Number of pages: 6
Volume: 3
Publication status: Published - 2004
Externally published: Yes
Event: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) - Sendai
Duration: 2004 Sep 28 to 2004 Oct 2

Other

Other: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
City: Sendai
Period: 04/9/28 to 04/10/2

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Yamamoto, S., Nakadai, K., Tsujino, H., & Okuno, H. G. (2004). Assessment of general applicability of robot audition system by recognizing three simultaneous speeches. In 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (Vol. 3, pp. 2111-2116)