An improvement in automatic speech recognition using soft missing feature masks for robot audition

Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceedingConference contribution

3 Citations (Scopus)

Abstract

We describe integration of preprocessing and automatic speech recognition based on Missing-Feature-Theory (MFT) to recognize a highly interfered speech signal, such as the signal in a narrow angle between a desired and interfered speakers. As a speech signal separated from a mixture of speech signals includes the leakage from other speech signals, recognition performance of the separated speech degrades. An important problem is estimating the leakage in time-frequency components. Once the leakage is estimated, we can generate missing feature masks (MFM) automatically by using our method. A new weighted sigmoid function is introduced for our MFM generation method. An experiment shows that a word correct rate improves from 66 % to 74 % by using our MFM generation method tuned by a search base approach in the parameter space.

Original languageEnglish
Title of host publicationIEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings
Pages964-969
Number of pages6
DOIs
Publication statusPublished - 2010 Dec 1
Event23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Taipei, Taiwan, Province of China
Duration: 2010 Oct 182010 Oct 22

Publication series

NameIEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings

Conference

Conference23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010
CountryTaiwan, Province of China
CityTaipei
Period10/10/1810/10/22

    Fingerprint

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Control and Systems Engineering

Cite this

Takahashi, T., Nakadai, K., Komatani, K., Ogata, T., & Okuno, H. G. (2010). An improvement in automatic speech recognition using soft missing feature masks for robot audition. In IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings (pp. 964-969). [5650540] (IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings). https://doi.org/10.1109/IROS.2010.5650540