Speaker verification-based evaluation of single-channel speech separation

Matthew Maciejewski, Shinji Watanabe, Sanjeev Khudanpur

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Speech enhancement techniques typically focus on intrinsic metrics of signal quality. The overwhelming majority of deep learning-based single-channel speech separation studies, for instance, have relied on a single class of metrics to evaluate the systems by. These metrics, usually variants of Signal-to-Distortion Ratio (SDR), measure fidelity to the “ground truth” waveform. This can be problematic, not only for lack of diversity in evaluation metrics, but also in cases where a perfect ground truth waveform may be unavailable. In this work, we explore the value of speaker verification as an extrinsic metric of separation quality, with additional utility as evidence of the benefits of separation as pre-processing for downstream tasks.

Original languageEnglish
Title of host publication22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
PublisherInternational Speech Communication Association
Pages2353-2357
Number of pages5
ISBN (Electronic)9781713836902
DOIs
Publication statusPublished - 2021
Externally publishedYes
Event22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 - Brno, Czech Republic
Duration: 2021 Aug 302021 Sep 3

Publication series

NameProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume3
ISSN (Print)2308-457X
ISSN (Electronic)1990-9772

Conference

Conference22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Country/TerritoryCzech Republic
CityBrno
Period21/8/3021/9/3

Keywords

  • Speaker verification
  • speech separation

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Fingerprint

Dive into the research topics of 'Speaker verification-based evaluation of single-channel speech separation'. Together they form a unique fingerprint.

Cite this