Speaker recognition benchmark using the CHiME-5 corpus

Daniel Garcia-Romero, David Snyder, Shinji Watanabe, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur

研究成果: Conference article査読

5 被引用数 (Scopus)

抄録

In this paper, we introduce a speaker recognition benchmark derived from the publicly-available CHiME-5 corpus. Our goal is to foster research that tackles the challenging artifacts introduced by far-field multi-speaker recordings of naturally occurring spoken interactions. The benchmark comprises four tasks that involve enrollment and test conditions with single-speaker and/or multi-speaker recordings. Additionally, it supports performance comparisons between close-talking vs distant/far-field microphone recordings, and single-microphone vs microphone-array approaches. We validate the evaluation design with a single-microphone state-of-the-art DNN speaker recognition and diarization system (that we are making publicly available). The results show that the proposed tasks are very challenging, and can be used to quantify the performance gap due to the degradations present in far-field multi-speaker recordings.

本文言語English
ページ(範囲)1506-1510
ページ数5
ジャーナルProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2019-September
DOI
出版ステータスPublished - 2019
外部発表はい
イベント20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 - Graz, Austria
継続期間: 2019 9月 152019 9月 19

ASJC Scopus subject areas

  • 言語および言語学
  • 人間とコンピュータの相互作用
  • 信号処理
  • ソフトウェア
  • モデリングとシミュレーション

フィンガープリント

「Speaker recognition benchmark using the CHiME-5 corpus」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル