Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation

Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Citations (Scopus)

Abstract

This paper describes a robot audition system that allows the user to barge-in; that is, the user can speak simultaneously when the robot is speaking. Our "barge-in-able" system consists of two stages: (1) cancellation of robot speech and (2) recognition of the separated user speech under the "semi-blind situation". The semi-blind situation is where a robot's speech signal is known but a user's speech signal is not. The first stage is achieved by using an adaptive filter based on time-frequency domain Independent Component Analysis, because that can separate robot speech more robustly against noise than conventional echo cancellers. To improve performance in online processing, we utilized known source normalization and the exponentially weighted stepsize method. The second stage is achieved by automatic speech recognition (ASR) based on the missing feature theory which provides robust recognition by exploiting the reliability of speech features distorted due to noise and/or separation. The semi-blind situation simplifies the estimation of such reliabilities. Experiments demonstrated that our system improved word correctness of ASR by 10.0 %.
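
The first stage can be illustrated with a minimal sketch for a single frequency bin. This is not the authors' implementation: the function names, the tanh-based nonlinearity, the unit-power scaling used for "known source normalization", and the per-tap decay used for the "exponentially weighted stepsize" are all assumptions chosen for illustration. The sketch treats the robot's spectrum `s` as known, models its echo as a short filter across frames, and adapts that filter so that the residual approximates the unknown user speech.

```python
import numpy as np


def phi(e, eps=1e-12):
    """Bounded nonlinearity (assumed form): tanh of the magnitude, phase kept."""
    mag = np.abs(e)
    return np.tanh(mag) * e / np.maximum(mag, eps)


def cancel_known_source(x, s, taps=3, mu=0.05, decay=0.8):
    """Adaptively remove the known robot spectrum s from the observation x.

    x, s  : complex arrays over frames for one frequency bin
    taps  : number of past frames of s used to model the echo (assumption)
    mu    : base step size (assumption)
    decay : per-tap step-size decay, a stand-in for the exponentially
            weighted stepsize idea (assumption)
    Returns the residual e (estimate of user speech) and the filter w.
    """
    s = np.asarray(s, dtype=complex)
    # Scale the known source to unit average power (assumed normalization).
    s = s / np.sqrt(np.mean(np.abs(s) ** 2))
    # Later taps get smaller steps, mirroring a decaying echo path.
    step = mu * decay ** np.arange(taps)
    w = np.zeros(taps, dtype=complex)
    e = np.zeros(len(x), dtype=complex)
    for f in range(len(x)):
        # Current and previous frames of the known source, zero-padded at the start.
        s_vec = np.array([s[f - m] if f >= m else 0.0 for m in range(taps)],
                         dtype=complex)
        e[f] = x[f] - w @ s_vec                  # subtract the estimated echo
        w += step * phi(e[f]) * np.conj(s_vec)   # nonlinear gradient-style update
    return e, w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = 4000
    s = (rng.standard_normal(frames) + 1j * rng.standard_normal(frames)) / np.sqrt(2)
    echo = np.convolve(s, [0.8, 0.4j, -0.2])[:frames]   # hypothetical echo path
    user = rng.laplace(scale=0.1, size=frames) + 1j * rng.laplace(scale=0.1, size=frames)
    e, w = cancel_known_source(echo + user, s)
    print("learned filter:", np.round(w, 2))
```

The bounded nonlinearity limits the influence of frames where the user's speech dominates the residual, which is one plausible reading of why an ICA-style update is described as more robust than a conventional echo canceller. The second stage (missing-feature ASR) is not covered by this sketch.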

Original language: English
Title of host publication: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS
Pages: 1718-1723
Number of pages: 6
DOIs: https://doi.org/10.1109/IROS.2008.4650799
Publication status: Published - 2008
Externally published: Yes
Event: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS - Nice
Duration: 2008 Sep 22 to 2008 Sep 26

Other

Other: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS
City: Nice
Period: 08/9/22 to 08/9/26

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Takeda, R., Nakadai, K., Komatani, K., Ogata, T., & Okuno, H. G. (2008). Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 1718-1723). [4650799] https://doi.org/10.1109/IROS.2008.4650799

Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation. / Takeda, Ryu; Nakadai, Kazuhiro; Komatani, Kazunori; Ogata, Tetsuya; Okuno, Hiroshi G.

2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS. 2008. p. 1718-1723 4650799.

Takeda, R, Nakadai, K, Komatani, K, Ogata, T & Okuno, HG 2008, Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation. in 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, 4650799, pp. 1718-1723, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, Nice, 08/9/22. https://doi.org/10.1109/IROS.2008.4650799
Takeda R, Nakadai K, Komatani K, Ogata T, Okuno HG. Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS. 2008. p. 1718-1723. 4650799 https://doi.org/10.1109/IROS.2008.4650799
Takeda, Ryu ; Nakadai, Kazuhiro ; Komatani, Kazunori ; Ogata, Tetsuya ; Okuno, Hiroshi G. / Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation. 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS. 2008. pp. 1718-1723
@inproceedings{bce44526d95a46518ad3ff69ed284760,
title = "Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation",
abstract = "This paper describes a robot audition system that allows the user to barge-in; that is, the user can speak simultaneously when the robot is speaking. Our {"}barge-in-able{"} system consists of two stages: (1) cancellation of robot speech and (2) recognition of the separated user speech under the {"}semi-blind situation{"}. The semi-blind situation is where a robot's speech signal is known but a user's speech signal is not. The first stage is achieved by using an adaptive filter based on time-frequency domain Independent Component Analysis, because that can separate robot speech more robustly against noise than conventional echo cancellers. To improve performance in online processing, we utilized known source normalization and the exponentially weighted stepsize method. The second stage is achieved by automatic speech recognition (ASR) based on the missing feature theory which provides robust recognition by exploiting the reliability of speech features distorted due to noise and/or separation. The semi-blind situation simplifies the estimation of such reliabilities. Experiments demonstrated that our system improved word correctness of ASR by 10.0 {\%}.",
author = "Takeda, Ryu and Nakadai, Kazuhiro and Komatani, Kazunori and Ogata, Tetsuya and Okuno, {Hiroshi G.}",
year = "2008",
doi = "10.1109/IROS.2008.4650799",
language = "English",
isbn = "9781424420582",
pages = "1718--1723",
booktitle = "2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS",
}

TY  - CONF
T1  - Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation
AU  - Takeda, Ryu
AU  - Nakadai, Kazuhiro
AU  - Komatani, Kazunori
AU  - Ogata, Tetsuya
AU  - Okuno, Hiroshi G.
PY  - 2008
Y1  - 2008
AB  - This paper describes a robot audition system that allows the user to barge-in; that is, the user can speak simultaneously when the robot is speaking. Our "barge-in-able" system consists of two stages: (1) cancellation of robot speech and (2) recognition of the separated user speech under the "semi-blind situation". The semi-blind situation is where a robot's speech signal is known but a user's speech signal is not. The first stage is achieved by using an adaptive filter based on time-frequency domain Independent Component Analysis, because that can separate robot speech more robustly against noise than conventional echo cancellers. To improve performance in online processing, we utilized known source normalization and the exponentially weighted stepsize method. The second stage is achieved by automatic speech recognition (ASR) based on the missing feature theory which provides robust recognition by exploiting the reliability of speech features distorted due to noise and/or separation. The semi-blind situation simplifies the estimation of such reliabilities. Experiments demonstrated that our system improved word correctness of ASR by 10.0 %.
UR  - http://www.scopus.com/inward/record.url?scp=69549122134&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=69549122134&partnerID=8YFLogxK
U2  - 10.1109/IROS.2008.4650799
DO  - 10.1109/IROS.2008.4650799
M3  - Conference contribution
SN  - 9781424420582
SP  - 1718
EP  - 1723
BT  - 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS
ER  -