Real-time meeting recognition and understanding using distant microphones and omni-directional camera

Takaaki Hori*, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato

*Corresponding author for this work

Research output

9 Citations (Scopus)

Abstract

This paper presents our newly developed real-time meeting analyzer for monitoring conversations in an ongoing group meeting. The goal of the system is to automatically recognize "who is speaking what" in an online manner for meeting assistance. Our system continuously captures the utterances and the face pose of each speaker using a distant microphone array and an omni-directional camera at the center of the meeting table. Through a series of advanced audio processing operations, an overlapping speech signal is enhanced and the components are separated into individual speakers' channels. Then the utterances are sequentially transcribed by our speech recognizer with low latency. In parallel with speech recognition, the activity of each participant (e.g. speaking, laughing, watching someone) and the situation of the meeting (e.g. topic, activeness, casualness) are detected and displayed on a browser together with the transcripts. In this paper, we describe our techniques and our attempt to achieve the low-latency monitoring of meetings, and we show our experimental results for real-time meeting transcription.

Original language: English
Title of host publication: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings
Pages: 424-429
Number of pages: 6
DOI
Publication status: Published - 2010
Externally published: Yes
Event: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Berkeley, CA, United States
Duration: 2010-12-12 to 2010-12-15

Publication series

Name: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings

Other

Other: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010
Country/Territory: United States
City: Berkeley, CA
Period: 2010-12-12 to 2010-12-15

ASJC Scopus subject areas

  • Language and Linguistics
