Enhanced robot speech recognition based on microphone array source separation and missing feature theory

Shun'ichi Yamamoto, Jean Marc Valin, Kazuhiro Nakadai, Jean Rouat, François Michaud, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

50 Citations (Scopus)

Abstract

A humanoid robot in real-world environments usually hears mixtures of sounds, so three capabilities are essential for robot audition: sound source localization, separation, and recognition of separated sounds. While the first two are frequently addressed, the last has received comparatively little study. We present a system that gives a humanoid robot the ability to localize, separate, and recognize simultaneous sound sources. A microphone array is used along with a dedicated real-time implementation of Geometric Source Separation (GSS) and a multi-channel post-filter that further reduces interference from other sources. An automatic speech recognizer (ASR) based on the Missing Feature Theory (MFT) recognizes separated sounds in real time by generating missing feature masks automatically from the post-filtering step. The main advantage of this approach for humanoid robots is that an ASR with a clean acoustic model can adapt to the distortion of the separated sound by consulting the post-filter feature masks. Recognition rates are presented for three simultaneous speakers located 2 m from the robot. Using both the post-filter and the missing feature mask yields an average relative reduction in error rate of 42%.
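The abstract does not spell out how the masks are derived from the post-filter, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes, hypothetically, that a time-frequency cell is marked reliable when the post-filter keeps at least a fixed fraction of its input energy, and that the recognizer scores a frame using only the reliable features. The function names, the energy-ratio rule, and the threshold value are all illustrative assumptions.

```python
from typing import List


def missing_feature_mask(
    pre_energy: List[List[float]],
    post_energy: List[List[float]],
    threshold: float = 0.3,  # hypothetical value, not taken from the paper
) -> List[List[int]]:
    """Build a binary reliability mask (1 = reliable, 0 = missing).

    pre_energy[t][f]  : energy entering the post-filter at frame t, band f
    post_energy[t][f] : energy leaving the post-filter at frame t, band f

    Assumption: a cell is reliable when the post-filter keeps at least
    `threshold` of the input energy, i.e. little interference was removed.
    """
    mask = []
    for pre_frame, post_frame in zip(pre_energy, post_energy):
        row = []
        for e_in, e_out in zip(pre_frame, post_frame):
            # Treat cells with no input energy as unreliable.
            ratio = e_out / e_in if e_in > 0.0 else 0.0
            row.append(1 if ratio >= threshold else 0)
        mask.append(row)
    return mask


def masked_log_likelihood(feature_log_likes: List[float], frame_mask: List[int]) -> float:
    """Sum per-feature log-likelihoods over reliable features only.

    This mimics, in simplified form, how a missing-feature recognizer can
    ignore unreliable features while still using a clean acoustic model.
    """
    return sum(ll for ll, m in zip(feature_log_likes, frame_mask) if m)


# Toy input: 2 frames x 3 bands (made-up numbers for illustration only).
pre = [[1.0, 2.0, 0.5], [4.0, 0.0, 1.0]]
post = [[0.9, 0.2, 0.4], [1.0, 0.0, 0.5]]
mask = missing_feature_mask(pre, post, threshold=0.3)
score = masked_log_likelihood([-1.0, -5.0, -2.0], mask[0])
```

With these toy values, the second band of the first frame is heavily attenuated by the post-filter and is therefore excluded from the frame score.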

Original language: English
Title of host publication: Proceedings of the 2005 IEEE International Conference on Robotics and Automation
Pages: 1477-1482
Number of pages: 6
DOIs: https://doi.org/10.1109/ROBOT.2005.1570323
Publication status: Published - 2005 Dec 1
Event: 2005 IEEE International Conference on Robotics and Automation - Barcelona, Spain
Duration: 2005 Apr 18 to 2005 Apr 22

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
Volume: 2005
ISSN (Print): 1050-4729

Conference

Conference: 2005 IEEE International Conference on Robotics and Automation
Country: Spain
City: Barcelona
Period: 05/4/18 to 05/4/22

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering

  • Cite this

    Yamamoto, S., Valin, J. M., Nakadai, K., Rouat, J., Michaud, F., Ogata, T., & Okuno, H. G. (2005). Enhanced robot speech recognition based on microphone array source separation and missing feature theory. In Proceedings of the 2005 IEEE International Conference on Robotics and Automation (pp. 1477-1482). [1570323] (Proceedings - IEEE International Conference on Robotics and Automation; Vol. 2005). https://doi.org/10.1109/ROBOT.2005.1570323