Evaluation of two simultaneous continuous speech recognition with ICA BSS and MFT-based ASR

Ryu Takeda, Shun'ichi Yamamoto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper describes an adaptation of independent component analysis (ICA) and missing feature theory (MFT)-based automatic speech recognition (ASR) to the recognition of two simultaneous continuous speech signals. We have previously reported on the utility of such a system for isolated word recognition, but the performance of MFT-based ASR depends on its configuration, such as the acoustic model, so the system needs to be evaluated under more general conditions. The system first separates the sound sources using ICA. Next, spectral distortion in the separated sounds is estimated in the time-frequency domain in terms of feature vectors, and missing feature masks (MFMs) are generated from the estimate. Finally, the separated sounds are recognized by MFT-based ASR. We tested isolated word recognition and continuous speech recognition with cepstral and spectral features. With the spectral features, the resulting system outperformed the baseline robot audition system by 13 and 6 points, respectively.
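
The mask-generation and recognition steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hard (binary) mask obtained by thresholding the ratio of estimated distortion to separated-signal magnitude, and a diagonal-Gaussian acoustic score in which unreliable features are simply left out of the sum. The function names, the threshold of 0.5, and the toy values are all hypothetical.

```python
import numpy as np


def missing_feature_mask(separated, distortion, threshold=0.5):
    """Return a binary mask: 1.0 where a feature is treated as reliable.

    `separated` and `distortion` are (frames, features) arrays of
    non-negative magnitudes. `threshold` is an illustrative value,
    not one taken from the paper.
    """
    # Relative distortion per time-frequency feature; the floor avoids 0/0.
    ratio = distortion / np.maximum(separated, 1e-12)
    return (ratio < threshold).astype(float)


def masked_log_likelihood(features, mask, mean, var):
    """Per-frame diagonal-Gaussian log-likelihood over reliable features only."""
    per_dim = -0.5 * (np.log(2.0 * np.pi * var) + (features - mean) ** 2 / var)
    # Features with mask 0 contribute nothing to the frame score.
    return np.sum(mask * per_dim, axis=1)


# Toy data: 2 frames, 2 features (assumed values for illustration).
separated = np.array([[1.0, 2.0], [4.0, 1.0]])
distortion = np.array([[0.1, 1.5], [0.4, 0.9]])
mask = missing_feature_mask(separated, distortion)
scores = masked_log_likelihood(separated, mask, mean=np.zeros(2), var=np.ones(2))
```

In this toy case the second feature of each frame has a high relative distortion, so only the first feature contributes to each frame's score. A soft mask with values between 0 and 1 would be a natural variation on the same idea.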

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 384-394
Number of pages: 11
Volume: 4570 LNAI
Publication status: Published - 2007
Externally published: Yes
Event: 20th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, IEA/AIE-2007 - Kyoto
Duration: 2007 Jun 26 → 2007 Jun 29

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4570 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Fingerprint

Continuous speech recognition
Blind source separation
Independent component analysis
Independent Component Analysis
Speech Recognition
Acoustic waves
Masks
Evaluation
Mask
Audition
Acoustics
Acoustic Model
Hearing
Feature Vector
Robots
Frequency Domain
Baseline
Robot
Configuration
Estimate

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Takeda, R., Yamamoto, S., Komatani, K., Ogata, T., & Okuno, H. G. (2007). Evaluation of two simultaneous continuous speech recognition with ICA BSS and MFT-based ASR. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4570 LNAI, pp. 384-394). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4570 LNAI).

@inproceedings{d714c5e76f814890b3aaead87d3db220,
title = "Evaluation of two simultaneous continuous speech recognition with ICA BSS and MFT-based ASR",
abstract = "An adaptation of independent component analysis (ICA) and missing feature theory (MFT)-based ASR for two simultaneous continuous speech recognition is described. We have reported on the utility of a system with isolated word recognition, but the performance of the MFT-based ASR is affected by the configuration, such as an acoustic model. The system needs to be evaluated under a more general condition. It first separates the sound sources using ICA. Then, spectral distortion in the separated sounds is estimated to generate missing feature masks (MFMs). Finally, the separated sounds are recognized by MFT-based ASR. We estimate spectral distortion in the temporal-frequency domain in terms of feature vectors, and we generate MFMs. We tested an isolated word and the continuous speech recognition with a cepstral and spectral feature. The resulting system outperformed the baseline robot audition system by 13 and 6 points respectively on the spectral features.",
author = "Ryu Takeda and Shun'ichi Yamamoto and Kazunori Komatani and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2007",
language = "English",
isbn = "9783540733225",
volume = "4570 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "384--394",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Evaluation of two simultaneous continuous speech recognition with ICA BSS and MFT-based ASR

AU - Takeda, Ryu

AU - Yamamoto, Shun'ichi

AU - Komatani, Kazunori

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2007

Y1 - 2007

N2 - An adaptation of independent component analysis (ICA) and missing feature theory (MFT)-based ASR for two simultaneous continuous speech recognition is described. We have reported on the utility of a system with isolated word recognition, but the performance of the MFT-based ASR is affected by the configuration, such as an acoustic model. The system needs to be evaluated under a more general condition. It first separates the sound sources using ICA. Then, spectral distortion in the separated sounds is estimated to generate missing feature masks (MFMs). Finally, the separated sounds are recognized by MFT-based ASR. We estimate spectral distortion in the temporal-frequency domain in terms of feature vectors, and we generate MFMs. We tested an isolated word and the continuous speech recognition with a cepstral and spectral feature. The resulting system outperformed the baseline robot audition system by 13 and 6 points respectively on the spectral features.

UR - http://www.scopus.com/inward/record.url?scp=37349051121&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=37349051121&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9783540733225

VL - 4570 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 384

EP - 394

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -