Multimodal interaction system that integrates speech and visual information

Satoru Hayamizu, Osamu Hasegawa, Katunobu Itou, Takashi Yoshimura, Tomoyoshi Akiba, Hideki Asoh, Shotaro Akaho, Takio Kurita, Katsuhiko Sakaue

Research output: peer-reviewed

Abstract

This paper presents studies related to multimodal interaction systems and describes a new direction in our research, 'intermodal learning'. The prototype system consists of four sub-systems (vision, graphical display, speech recognition, and speech synthesis) and an interaction manager. We demonstrated that it can learn a user's face and name, as well as the appearance and names of objects. To learn new words, it uses a speech recognition technique that estimates phonetic transcriptions from multiple speech samples. This is similar to the way a baby learns about the real world by communicating with its parents.
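The abstract does not detail the transcription-estimation algorithm itself. As a minimal sketch of the multiple-sample idea (not the paper's exact method), assume a phone recognizer that yields one phone sequence per utterance of the new word; one simple consensus rule is to pick the hypothesis with the smallest total edit distance to all the others:

```python
# Sketch only: consensus phonetic transcription from several utterances
# of the same new word. The recognizer, phone set, and example sequences
# below are hypothetical, not from the paper.

def edit_distance(a, b):
    # Standard Levenshtein distance over phone symbols.
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

def estimate_transcription(hypotheses):
    # Return the hypothesis closest, in total edit distance, to all others.
    return min(hypotheses,
               key=lambda h: sum(edit_distance(h, other) for other in hypotheses))

# Hypothetical phone sequences for three utterances of one new word.
samples = [
    ["h", "a", "y", "a", "m", "i", "z", "u"],
    ["h", "a", "y", "a", "m", "i", "s", "u"],
    ["h", "a", "y", "a", "m", "i", "z", "u"],
]
print(estimate_transcription(samples))
# -> ['h', 'a', 'y', 'a', 'm', 'i', 'z', 'u']
```

Averaging over multiple samples in this way makes the learned transcription robust to recognition errors in any single utterance, which is the property the abstract highlights.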

Original language: English
Pages (from-to): 37-44
Number of pages: 8
Journal: Denshi Gijutsu Sogo Kenkyusho Iho/Bulletin of the Electrotechnical Laboratory
Volume: 64
Issue number: 4-5
Publication status: Published - 2000
Externally published: Yes

ASJC Scopus subject areas

  • Condensed Matter Physics
  • Electrical and Electronic Engineering
