An improvement in audio-visual voice activity detection for automatic speech recognition

Takami Yoshida, Kazuhiro Nakadai, Hiroshi G. Okuno

Research output: Conference contribution

6 Citations (Scopus)

Abstract

Noise-robust Automatic Speech Recognition (ASR) is essential for robots that are expected to communicate with humans in everyday environments. In such environments, Voice Activity Detection (VAD) strongly affects ASR performance because there are many sources of acoustic and visual noise. In this paper, we improved Audio-Visual VAD for our two-layered audio-visual integration framework for ASR by using hangover processing based on erosion and dilation. We implemented the proposed method in our audio-visual speech recognition system for a robot. Empirical results show the effectiveness of the proposed method in terms of VAD.
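
The abstract does not spell out how erosion and dilation are applied, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that VAD decisions arrive as one binary flag per frame (1 = speech), that a dilation followed by an erosion fills short gaps inside an utterance, and that an erosion followed by a dilation then removes short spurious detections. The function names, the ordering of the two passes, and the `width` parameter are all hypothetical.

```python
def dilate(frames, width):
    """Mark a frame as speech if any frame within `width` frames of it is speech."""
    n = len(frames)
    return [int(any(frames[max(0, i - width):min(n, i + width + 1)]))
            for i in range(n)]


def erode(frames, width):
    """Keep a frame as speech only if every frame within `width` frames of it is speech."""
    n = len(frames)
    return [int(all(frames[max(0, i - width):min(n, i + width + 1)]))
            for i in range(n)]


def hangover(frames, width):
    """Smooth frame-level VAD flags (assumed ordering, not taken from the paper)."""
    # Dilation then erosion: fills gaps up to 2 * width frames long.
    closed = erode(dilate(frames, width), width)
    # Erosion then dilation: removes speech runs up to 2 * width frames long.
    return dilate(erode(closed, width), width)


# Hypothetical raw decisions: a one-frame gap at index 4 and an isolated
# one-frame detection at index 11.
raw = [0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
print(hangover(raw, 1))
# prints [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```

With `width = 1`, the one-frame gap inside the utterance is filled and the isolated one-frame detection is discarded. A larger `width` tolerates longer pauses but also delays or merges segment boundaries.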

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 51-61
Number of pages: 11
Volume: 6096 LNAI
Edition: PART 1
DOI: https://doi.org/10.1007/978-3-642-13022-9_6
Publication status: Published - 2010
Externally published: Yes
Event: 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligence Systems, IEA/AIE 2010 - Cordoba
Duration: 1 June 2010 to 4 June 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6096 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligence Systems, IEA/AIE 2010
City: Cordoba
Period: 1 June 2010 to 4 June 2010

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Yoshida, T., Nakadai, K., & Okuno, H. G. (2010). An improvement in audio-visual voice activity detection for automatic speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 1 ed., Vol. 6096 LNAI, pp. 51-61). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6096 LNAI, No. PART 1). https://doi.org/10.1007/978-3-642-13022-9_6