Data collection for mobile audio-visual speech recognition in various environments

Satoshi Tamura, Takumi Seko, Satoru Hayamizu

Research output: Conference contribution

1 citation (Scopus)

Abstract

This paper introduces our recent work on audio-visual speech recognition for mobile devices and on data collection in various environments. Audio-visual automatic speech recognition is effective in noisy or real-world conditions, where it enhances the robustness of the speech recognizer and improves recognition accuracy. We have developed an audio-visual speech recognition interface for mobile devices. To evaluate the recognizer and investigate issues related to audio-visual processing on mobile computers, we collected speech data and lip images from 16 subjects in eight conditions involving various audio noises and visual difficulties. Audio-only speech recognition and visual-only lipreading experiments were then conducted. Through these experiments, we identified several issues and directions for future work, not only for constructing audio-visual databases but also for robust audio-visual speech recognition.

Original language: English
Title of host publication: Oriental COCOSDA 2014 - 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment / CASLRE (Conference on Asian Spoken Language Research and Evaluation)
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781479970940
DOI
Publication status: Published - 2014 2 27
Externally published: Yes
Event: 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, Oriental COCOSDA 2014 - Phuket, Thailand
Duration: 2014 9 10 to 2014 9 12

Publication series

Name: Oriental COCOSDA 2014 - 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment / CASLRE (Conference on Asian Spoken Language Research and Evaluation)

Other

Other: 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, Oriental COCOSDA 2014
Country/Territory: Thailand
City: Phuket
Period: 14/9/10 to 14/9/12

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Language and Linguistics
  • Linguistics and Language
