End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection

Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola Garcia, Kenji Nagamatsu

Research output: Conference contribution

Abstract

In this paper, we present a conditional multitask learning method for end-to-end neural speaker diarization (EEND). The EEND system has shown promising performance compared with traditional clustering-based methods, especially in the case of overlapping speech. To further improve the performance of the EEND system, we propose a novel multitask learning framework that solves speaker diarization and a desired subtask while explicitly considering the task dependency. Based on the probabilistic chain rule, we optimize speaker diarization conditioned on speech activity and overlap detection, which are subtasks of speaker diarization. Experimental results show that our proposed method can leverage a subtask to effectively model speaker diarization and that it outperforms conventional EEND systems in terms of diarization error rate.
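
As a rough illustration of the chain-rule conditioning described in the abstract, the factorization can be written as below. The notation is an assumption introduced here for illustration and is not taken from the paper: X denotes the input acoustic features, Z the frame-wise subtask labels (speech activity or overlap), and Y the frame-wise speaker activity labels. The second line is one plausible training objective under this factorization; the actual loss used by the authors may differ, for example in how the two terms are weighted or in how speaker permutations are handled.

```latex
% Assumed notation: X = acoustic features, Z = subtask labels, Y = speaker labels
P(Y, Z \mid X) = P(Z \mid X) \, P(Y \mid Z, X)

% One possible multitask objective (negative log-likelihood of the joint)
\mathcal{L} = -\log P(Z \mid X) - \log P(Y \mid Z, X)
```

Under this reading, the subtask is predicted first from the audio, and the diarization output is then predicted from both the audio and the subtask result, which is what distinguishes it from a multitask setup where the two outputs are predicted independently.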

Original language: English
Title of host publication: 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 849-856
Number of pages: 8
ISBN (electronic): 9781728170664
DOI
Publication status: Published - Jan 19, 2021
Event: 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Virtual, Shenzhen, China
Duration: Jan 19, 2021 to Jan 22, 2021

Publication series

Name: 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings

Conference

Conference: 2021 IEEE Spoken Language Technology Workshop, SLT 2021
Country: China
City: Virtual, Shenzhen
Period: 21/1/19 to 21/1/22

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
