A speaker diarization system with robust speaker localization and voice activity detection

Yangyang Huang, Takuma Otsuka, Hiroshi G. Okuno

Research output: Chapter

2 Citations (Scopus)

Abstract

In real-world auditory scene analysis for human-robot interaction, three types of information must be extracted from the observation data: who speaks, when, and where. We present a speaker diarization system that extracts this information. Multiple signal classification (MUSIC) is a powerful method for voice activity detection (VAD) and direction of arrival (DOA) estimation. We describe our system and compare its VAD and DOA estimation performance with that of a method based on the MUSIC algorithm.

Original language: English
Title of host publication: Studies in Computational Intelligence
Pages: 77-82
Number of pages: 6
Volume: 489
DOI
Publication status: Published - 2013
Externally published: Yes

Publication series

Name: Studies in Computational Intelligence
Volume: 489
ISSN (Print): 1860-949X

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Huang, Y., Otsuka, T., & Okuno, H. G. (2013). A speaker diarization system with robust speaker localization and voice activity detection. In Studies in Computational Intelligence (Vol. 489, pp. 77-82). (Studies in Computational Intelligence; Vol. 489). https://doi.org/10.1007/978-3-319-00651-2-11