TY - GEN
T1 - An improvement in audio-visual voice activity detection for automatic speech recognition
AU - Yoshida, Takami
AU - Nakadai, Kazuhiro
AU - Okuno, Hiroshi G.
PY - 2010
Y1 - 2010
N2 - Noise-robust Automatic Speech Recognition (ASR) is essential for robots that are expected to communicate with humans in daily environments. In such environments, Voice Activity Detection (VAD) strongly affects ASR performance because there is a great deal of acoustic and visual noise. In this paper, we improve Audio-Visual VAD for our two-layered audio-visual integration framework for ASR by using hangover processing based on erosion and dilation. We implemented the proposed method in our audio-visual speech recognition system for a robot. Empirical results show the effectiveness of the proposed method in terms of VAD.
AB - Noise-robust Automatic Speech Recognition (ASR) is essential for robots that are expected to communicate with humans in daily environments. In such environments, Voice Activity Detection (VAD) strongly affects ASR performance because there is a great deal of acoustic and visual noise. In this paper, we improve Audio-Visual VAD for our two-layered audio-visual integration framework for ASR by using hangover processing based on erosion and dilation. We implemented the proposed method in our audio-visual speech recognition system for a robot. Empirical results show the effectiveness of the proposed method in terms of VAD.
KW - Audio-Visual integration
KW - Speech Recognition
KW - Voice Activity Detection
UR - http://www.scopus.com/inward/record.url?scp=79551526836&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79551526836&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-13022-9_6
DO - 10.1007/978-3-642-13022-9_6
M3 - Conference contribution
AN - SCOPUS:79551526836
SN - 3642130216
SN - 9783642130212
VL - 6096 LNAI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 51
EP - 61
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
T2 - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligence Systems, IEA/AIE 2010
Y2 - 1 June 2010 through 4 June 2010
ER -