A DOA based speaker diarization system for real meetings

Shoko Araki*, Masakiyo Fujimoto, Kentaro Ishizuka, Hiroshi Sawada, Shoji Makino

*Corresponding author for this work

Research output: Conference contribution

23 citations (Scopus)

Abstract

This paper presents a speaker diarization system that estimates who spoke when in a meeting. The proposed system combines a noise-robust voice activity detector (VAD), a direction of arrival (DOA) estimator, and a DOA classifier. Our previous system used the generalized cross correlation method with the phase transform (GCC-PHAT) for DOA estimation. Because GCC-PHAT can estimate only one DOA per frame, it was difficult to handle speaker overlaps. This paper addresses the issue by estimating a DOA at each time-frequency slot (TFDOA), and reports how this improves diarization performance for real meetings and conversations recorded in a room with a reverberation time of 350 ms.
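To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of the two kinds of estimate for a single microphone pair. `gcc_phat_delay` returns one delay per frame, which is why a frame with two overlapping talkers still yields a single DOA. `tf_slot_delays` derives one delay per time-frequency slot from the inter-channel phase, which is one common way to obtain per-slot DOAs; the abstract does not specify how TFDOA is actually computed. All function names, the far-field two-microphone geometry, the assumed speed of sound, and the microphone spacing are illustrative assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def gcc_phat_delay(x1, x2, fs):
    """Return the delay (s) of x2 relative to x1 for one frame, via GCC-PHAT."""
    n = len(x1) + len(x2)                 # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12        # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so that index 0 is lag -max_shift and the centre is lag 0.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(cc) - max_shift) / fs


def tf_slot_delays(X1, X2, freqs):
    """Per-slot delay (s) from inter-channel phase.

    X1, X2: STFTs of the two channels, shape (n_freqs, n_frames).
    freqs:  strictly positive bin frequencies in Hz (drop the DC bin first).
    Only unambiguous while |2*pi*f*tau| < pi (no spatial aliasing).
    """
    phase = np.angle(X2 * np.conj(X1))    # equals -2*pi*f*tau for one source
    return -phase / (2.0 * np.pi * freqs[:, None])


def delay_to_doa(tau, mic_distance):
    """Far-field, two-microphone conversion from delay (s) to angle (degrees)."""
    s = np.clip(SPEED_OF_SOUND * np.asarray(tau) / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(s))
```

In a pipeline of the kind the abstract describes, the per-slot angles from `delay_to_doa(tf_slot_delays(...), mic_distance)` would be restricted to frames the VAD marks as speech and then clustered by the DOA classifier, so that two speakers active in the same frame can fall into different clusters.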

Original language: English
Title of host publication: 2008 Hands-free Speech Communication and Microphone Arrays, Proceedings, HSCMA 2008
Pages: 29-32
Number of pages: 4
DOI
Publication status: Published - 2008
Externally published: Yes
Event: 2008 Hands-free Speech Communication and Microphone Arrays, HSCMA 2008 - Trento, Italy
Duration: 2008-05-06 to 2008-05-08

Publication series

Name: 2008 Hands-free Speech Communication and Microphone Arrays, Proceedings, HSCMA 2008

Conference

Conference: 2008 Hands-free Speech Communication and Microphone Arrays, HSCMA 2008
Country/Territory: Italy
City: Trento
Period: 2008-05-06 to 2008-05-08

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering
  • Communication
