Leak Energy Based Missing Feature Mask Generation for ICA and GSS and Its Evaluation with Simultaneous Speech Recognition

Shun'ichi Yamamoto*, Ryu Takeda, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean Marc Valin, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*この研究の対応する著者

研究成果: Paper査読

抄録

This paper addresses automatic speech recognition (ASR) for robots integrated with sound source separation (SSS) by using leak noise based missing feature mask generation. The missing feature theory (MFT) is a promising approach to improve noise-robustness of ASR. An issue in MFT-based ASR is automatic generation of the missing feature mask. To improve robot audition, we applied this theory to interface ASR and SSS which extracts a sound source originated from a specific direction by multiple microphones. In a robot audition system, it is a promising approach to use SSS as a pre-processor for ASR to be able to deal with any kind of noises. However, ASR usually assumes clean speech input, while speech extracted by SSS never fails to be distorted. MFT can be applied to cope with distortion in the extracted speech. In this case, we can assume that the noises included in extracted sounds are mainly leakages from other channels. Thus, we introduced leak noise based missing feature mask generation, which can generate a missing feature mask automatically by using information on leak noise obtained from other channels. To assess the effectiveness of the leak noise based missing feature mask generation, we used two methods for SSS: geometric source separation (GSS) and independent component analysis (ICA), and Multiband Julian for MFT based ASR. The two constructed systems, that is, GSS-based and ICA-based robot audition systems, were evaluated through recognition of simultaneous speech uttered by two speakers. As a result, we showed that the proposed leak noise based missing feature mask generation worked well in both systems.

本文言語English
ページ42-47
ページ数6
出版ステータスPublished - 2006
外部発表はい
イベント2006 ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, SAPA 2006 - Pittsburgh, United States
継続期間: 2006 9月 16 → …

Conference

Conference2006 ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, SAPA 2006
国/地域United States
CityPittsburgh
Period06/9/16 → …

ASJC Scopus subject areas

  • 電子工学および電気工学
  • コンピュータ ビジョンおよびパターン認識
  • ソフトウェア

フィンガープリント

「Leak Energy Based Missing Feature Mask Generation for ICA and GSS and Its Evaluation with Simultaneous Speech Recognition」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル