Real-time meeting recognition and understanding using distant microphones and omni-directional camera

Takaaki Hori, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)

Abstract

This paper presents our newly developed real-time meeting analyzer for monitoring conversations in an ongoing group meeting. The goal of the system is to automatically recognize "who is speaking what" in an online manner for meeting assistance. Our system continuously captures the utterances and the face pose of each speaker using a distant microphone array and an omni-directional camera at the center of the meeting table. Through a series of advanced audio processing operations, an overlapping speech signal is enhanced and the components are separated into individual speaker's channels. Then the utterances are sequentially transcribed by our speech recognizer with low latency. In parallel with speech recognition, the activity of each participant (e.g. speaking, laughing, watching someone) and the situation of the meeting (e.g. topic, activeness, casualness) are detected and displayed on a browser together with the transcripts. In this paper, we describe our techniques and our attempt to achieve the low-latency monitoring of meetings, and we show our experimental results for real-time meeting transcription.

Original language: English
Title of host publication: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings
Pages: 424-429
Number of pages: 6
DOIs: https://doi.org/10.1109/SLT.2010.5700890
Publication status: Published - 2010 Dec 1
Externally published: Yes
Event: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Berkeley, CA, United States
Duration: 2010 Dec 12 → 2010 Dec 15

Publication series

Name: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings

Other

Other: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010
Country: United States
City: Berkeley, CA
Period: 10/12/12 → 10/12/15

Keywords

  • Distant microphones
  • Meeting analysis
  • Speaker diarization
  • Speech enhancement
  • Speech recognition
  • Topic tracking

ASJC Scopus subject areas

  • Language and Linguistics

Cite this

Hori, T., Araki, S., Yoshioka, T., Fujimoto, M., Watanabe, S., Oba, T., Ogawa, A., Otsuka, K., Mikami, D., Kinoshita, K., Nakatani, T., Nakamura, A., & Yamato, J. (2010). Real-time meeting recognition and understanding using distant microphones and omni-directional camera. In 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings (pp. 424-429). [5700890] (2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings). https://doi.org/10.1109/SLT.2010.5700890