Online meeting recognizer with multichannel speaker diarization

Shoko Araki*, Takaaki Hori, Masakiyo Fujimoto, Shinji Watanabe, Takuya Yoshioka, Tomohiro Nakatani, Atsushi Nakamura

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

10 Citations (Scopus)

Abstract

We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to automatically estimate "who speaks when and what" in an online manner. In our system, "who speaks when" information is first obtained by estimating the directions of arrival (DOAs) of signals. Then, "who speaks what" is estimated with our automatic speech recognition (ASR) system, after suppressing reverberation, background noise, and interfering speakers' voices. In this paper, we focus particularly on the speaker diarization ("who speaks when" estimation) method, and we show that the speaker diarization information helps the ASR to reduce insertion errors.
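
As a rough illustration of how per-frame DOA estimates can be turned into "who speaks when" labels in an online manner, the sketch below groups frames by direction and merges them into speaker segments. It is not the method of the paper: the class name `OnlineDoaDiarizer`, the parameters `tolerance_deg` and `update_rate`, and the input DOA values are all hypothetical, and the DOA estimation itself is assumed to have been done upstream (with `None` marking frames where no speech was detected). Angles are assumed to lie in a range without wraparound, e.g. 0 to 180 degrees.

```python
from dataclasses import dataclass, field


@dataclass
class OnlineDoaDiarizer:
    """Toy online diarizer: one speaker per DOA cluster (illustrative only)."""
    tolerance_deg: float = 15.0  # hypothetical: max angular distance to join a cluster
    update_rate: float = 0.1     # hypothetical: smoothing factor for cluster centres
    centres: list = field(default_factory=list)  # one DOA centre (degrees) per speaker

    def assign(self, doa_deg):
        """Return a speaker index for one frame; None marks a silent frame."""
        if doa_deg is None:
            return None
        if self.centres:
            # Find the nearest existing cluster centre.
            idx = min(range(len(self.centres)),
                      key=lambda i: abs(self.centres[i] - doa_deg))
            if abs(self.centres[idx] - doa_deg) <= self.tolerance_deg:
                # Move the centre slightly toward the new observation.
                self.centres[idx] += self.update_rate * (doa_deg - self.centres[idx])
                return idx
        # No cluster is close enough: treat this direction as a new speaker.
        self.centres.append(float(doa_deg))
        return len(self.centres) - 1


def to_segments(labels, frame_sec):
    """Merge consecutive frames with the same speaker into (speaker, start, end)."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] is not None:
                segments.append((labels[start], start * frame_sec, i * frame_sec))
            start = i
    return segments


diarizer = OnlineDoaDiarizer()
frame_doas = [30.0, 32.0, 31.0, None, 120.0, 118.0, 29.0, 30.0]  # made-up input
labels = [diarizer.assign(d) for d in frame_doas]
segments = to_segments(labels, frame_sec=0.5)
print(segments)
```

Each frame is processed as it arrives, so the labels are available without waiting for the end of the meeting. In a full system, segments like these could be used to gate the recognizer so that it only decodes while the corresponding speaker is active, which is one plausible way diarization information could reduce insertion errors.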

Original language: English
Title of host publication: Conference Record of the 44th Asilomar Conference on Signals, Systems and Computers, Asilomar 2010
Pages: 1697-1701
Number of pages: 5
DOIs
Publication status: Published - 2010 Dec 1
Externally published: Yes
Event: 44th Asilomar Conference on Signals, Systems and Computers, Asilomar 2010 - Pacific Grove, CA, United States
Duration: 2010 Nov 7 - 2010 Nov 10

Publication series

Name: Conference Record - Asilomar Conference on Signals, Systems and Computers
ISSN (Print): 1058-6393

Other

Other: 44th Asilomar Conference on Signals, Systems and Computers, Asilomar 2010
Country/Territory: United States
City: Pacific Grove, CA
Period: 10/11/7 - 10/11/10

ASJC Scopus subject areas

  • Signal Processing
  • Computer Networks and Communications
