Recognition of simultaneous speech by estimating reliability of separated signals for robot audition

Shun'ichi Yamamoto*, Ryu Takeda, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean Marc Valin, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

"Listening to several things at once" is a long-standing human dream and one goal of AI and robot audition, because psychophysical observations show that people can listen to at most two things at once. Current noise reduction techniques cannot help achieve this goal because they assume quasi-stationary noise, not interfering speech signals. Since robots are used in various environments, robot audition systems should require minimal a priori information about their acoustic environments and speakers. We evaluate a missing feature theory approach that serves as the interface between sound source separation (SSS) and automatic speech recognition. Its essential part is estimating the reliability of each feature of the separated sounds. We tested two kinds of robot audition systems that use SSS: independent component analysis (ICA) with two microphones, and geometric source separation (GSS) with eight microphones. For each SSS method, automatic missing feature mask generation was developed. The recognition accuracy for two simultaneous speech signals improved to an average of 67.8% and 88.0% for ICA and GSS, respectively.
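
The missing feature idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' mask generation method: the reliability measure (estimated target energy divided by total energy), the 0.5 threshold, and the names `binary_mask` and `masked_log_likelihood` are all assumptions made for illustration. The sketch shows only the general principle that features judged unreliable after separation are excluded (marginalized out) when a diagonal-Gaussian acoustic score is computed.

```python
import math


def binary_mask(target_energy, interference_energy, threshold=0.5):
    """Return 1 for features judged reliable, 0 otherwise.

    Reliability here is a hypothetical measure: the estimated share of
    target energy in each feature after source separation.
    """
    mask = []
    for s, n in zip(target_energy, interference_energy):
        total = s + n
        reliability = s / total if total > 0 else 0.0
        mask.append(1 if reliability > threshold else 0)
    return mask


def masked_log_likelihood(features, mask, means, variances):
    """Diagonal-Gaussian log-likelihood over reliable features only.

    Fully marginalizing an unreliable dimension of a diagonal Gaussian
    contributes a factor of 1, so its term is simply skipped.
    """
    ll = 0.0
    for x, m, mu, var in zip(features, mask, means, variances):
        if m:
            ll += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return ll


# Usage: the second feature is dominated by interference, so its large
# mismatch with the model mean does not affect the score.
mask = binary_mask([4.0, 1.0], [1.0, 3.0])
score = masked_log_likelihood([0.0, 100.0], mask, [0.0, 0.0], [1.0, 1.0])
```

In this sketch the mask is hard (0 or 1); a soft mask with values between 0 and 1 would be a different design choice and is not shown here.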

Original language: English
Title of host publication: PRICAI 2006
Subtitle of host publication: Trends in Artificial Intelligence - 9th Pacific Rim International Conference on Artificial Intelligence, Proceedings
Publisher: Springer Verlag
Pages: 484-494
Number of pages: 11
ISBN (Print): 3540366679, 9783540366676
Publication status: Published - 2006
Externally published: Yes
Event: 9th Pacific Rim International Conference on Artificial Intelligence - Guilin, China
Duration: 2006 Aug 7 to 2006 Aug 11

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4099 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th Pacific Rim International Conference on Artificial Intelligence
Country/Territory: China
City: Guilin
Period: 06/8/7 to 06/8/11

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
