Improvement of lip reading performance in real environments using speaker and environmental adaptation

Takuya Kawasaki, Naoya Ukai, Seko Takumi, Satoshi Tamura, Satoru Hayamizu

Research output: peer-reviewed

1 citation (Scopus)

Abstract

Lip reading technologies play an important role not only in image pattern recognition (e.g., computer vision) but also in audio-visual pattern recognition (e.g., bimodal speech recognition). However, recognition accuracy remains significantly lower than that of speech recognition. A further problem is that performance degrades in real environments. To improve performance, in this paper we employ two adaptation schemes: speaker adaptation and environmental adaptation. Speaker adaptation is applied to the recognition models to prevent the degradation caused by differences between speakers. Environmental adaptation is also conducted to deal with environmental differences. We tested these adaptation schemes on CENSREC-2-AV, a real-world audio-visual corpus that we built; it contains speech signals and lip images recorded in a car being driven, in which subjects uttered Japanese connected digits. Experimental results show that lip reading recognition performance was largely improved by speaker adaptation and further recovered by environmental adaptation.

Original language: English
Pages: 346-350
Number of pages: 5
DOI
Publication status: Published - 2013
Externally published: Yes
Event: 2013 2nd IAPR Asian Conference on Pattern Recognition, ACPR 2013 - Naha, Okinawa, Japan
Duration: 5 Nov 2013 to 8 Nov 2013

Conference

Conference: 2013 2nd IAPR Asian Conference on Pattern Recognition, ACPR 2013
Country/Territory: Japan
City: Naha, Okinawa
Period: 5 Nov 2013 to 8 Nov 2013

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
