Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition

Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

18 Citations (Scopus)

Abstract

This paper describes a new semi-blind source separation (semi-BSS) technique with independent component analysis (ICA) for enhancing a target source of interest and for suppressing other known interference sources. The semi-BSS technique is necessary for double-talk free robot audition systems in order to utilize known sound source signals such as self speech, music, or TV-sound, through a line-in or ubiquitous network. Unlike the conventional semi-BSS with ICA, we use the time-frequency domain convolution model to describe the reflection of the sound and a new mixing process of sounds for ICA. In other words, we consider that reflected sounds during some delay time are different from the original. ICA then separates the reflections as other interference sources. The model enables us to eliminate the frame size limitations of the frequency-domain ICA, and ICA can separate the known sources under a highly reverberative environment. Experimental results show that our method outperformed the conventional semi-BSS using ICA under simulated normal and highly reverberative environments.

Original language: English
Title of host publication: IEEE International Conference on Intelligent Robots and Systems
Pages: 1757-1762
Number of pages: 6
DOIs: https://doi.org/10.1109/IROS.2007.4399297
Publication status: Published - 2007
Externally published: Yes
Event: 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2007 - San Diego, CA
Duration: 2007 Oct 29 to 2007 Nov 2

Other

Other: 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2007
City: San Diego, CA
Period: 07/10/29 to 07/11/2

Fingerprint

Independent component analysis
Audition
Acoustic waves
Robots
Blind source separation
Convolution
Time delay

ASJC Scopus subject areas

  • Control and Systems Engineering

Cite this

Takeda, R., Nakadai, K., Komatani, K., Ogata, T., & Okuno, H. G. (2007). Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition. In IEEE International Conference on Intelligent Robots and Systems (pp. 1757-1762). [4399297] https://doi.org/10.1109/IROS.2007.4399297

Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition. / Takeda, Ryu; Nakadai, Kazuhiro; Komatani, Kazunori; Ogata, Tetsuya; Okuno, Hiroshi G.

IEEE International Conference on Intelligent Robots and Systems. 2007. p. 1757-1762 4399297.


Takeda, R, Nakadai, K, Komatani, K, Ogata, T & Okuno, HG 2007, Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition. in IEEE International Conference on Intelligent Robots and Systems., 4399297, pp. 1757-1762, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2007, San Diego, CA, 07/10/29. https://doi.org/10.1109/IROS.2007.4399297
Takeda R, Nakadai K, Komatani K, Ogata T, Okuno HG. Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition. In IEEE International Conference on Intelligent Robots and Systems. 2007. p. 1757-1762. 4399297 https://doi.org/10.1109/IROS.2007.4399297
Takeda, Ryu ; Nakadai, Kazuhiro ; Komatani, Kazunori ; Ogata, Tetsuya ; Okuno, Hiroshi G. / Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition. IEEE International Conference on Intelligent Robots and Systems. 2007. pp. 1757-1762
@inproceedings{612629b687df4ac09d6549ada23348b9,
title = "Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition",
abstract = "This paper describes a new semi-blind source separation (semi-BSS) technique with independent component analysis (ICA) for enhancing a target source of interest and for suppressing other known interference sources. The semi-BSS technique is necessary for double-talk free robot audition systems in order to utilize known sound source signals such as self speech, music, or TV-sound, through a line-in or ubiquitous network. Unlike the conventional semi-BSS with ICA, we use the time-frequency domain convolution model to describe the reflection of the sound and a new mixing process of sounds for ICA. In other words, we consider that reflected sounds during some delay time are different from the original. ICA then separates the reflections as other interference sources. The model enables us to eliminate the frame size limitations of the frequency-domain ICA, and ICA can separate the known sources under a highly reverberative environment. Experimental results show that our method outperformed the conventional semi-BSS using ICA under simulated normal and highly reverberative environments.",
author = "Ryu Takeda and Kazuhiro Nakadai and Kazunori Komatani and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2007",
doi = "10.1109/IROS.2007.4399297",
language = "English",
isbn = "1424409128",
pages = "1757--1762",
booktitle = "IEEE International Conference on Intelligent Robots and Systems",

}

TY - GEN

T1 - Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition

AU - Takeda, Ryu

AU - Nakadai, Kazuhiro

AU - Komatani, Kazunori

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2007

Y1 - 2007

N2 - This paper describes a new semi-blind source separation (semi-BSS) technique with independent component analysis (ICA) for enhancing a target source of interest and for suppressing other known interference sources. The semi-BSS technique is necessary for double-talk free robot audition systems in order to utilize known sound source signals such as self speech, music, or TV-sound, through a line-in or ubiquitous network. Unlike the conventional semi-BSS with ICA, we use the time-frequency domain convolution model to describe the reflection of the sound and a new mixing process of sounds for ICA. In other words, we consider that reflected sounds during some delay time are different from the original. ICA then separates the reflections as other interference sources. The model enables us to eliminate the frame size limitations of the frequency-domain ICA, and ICA can separate the known sources under a highly reverberative environment. Experimental results show that our method outperformed the conventional semi-BSS using ICA under simulated normal and highly reverberative environments.

AB - This paper describes a new semi-blind source separation (semi-BSS) technique with independent component analysis (ICA) for enhancing a target source of interest and for suppressing other known interference sources. The semi-BSS technique is necessary for double-talk free robot audition systems in order to utilize known sound source signals such as self speech, music, or TV-sound, through a line-in or ubiquitous network. Unlike the conventional semi-BSS with ICA, we use the time-frequency domain convolution model to describe the reflection of the sound and a new mixing process of sounds for ICA. In other words, we consider that reflected sounds during some delay time are different from the original. ICA then separates the reflections as other interference sources. The model enables us to eliminate the frame size limitations of the frequency-domain ICA, and ICA can separate the known sources under a highly reverberative environment. Experimental results show that our method outperformed the conventional semi-BSS using ICA under simulated normal and highly reverberative environments.

UR - http://www.scopus.com/inward/record.url?scp=51349166253&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=51349166253&partnerID=8YFLogxK

U2 - 10.1109/IROS.2007.4399297

DO - 10.1109/IROS.2007.4399297

M3 - Conference contribution

AN - SCOPUS:51349166253

SN - 1424409128

SN - 9781424409129

SP - 1757

EP - 1762

BT - IEEE International Conference on Intelligent Robots and Systems

ER -