VocalTurk: Exploring feasibility of crowdsourced speaker identification

Susumu Saito, Yuta Ide, Teppei Nakano, Tetsuji Ogawa

研究成果: Conference contribution

抄録

This paper presents VocalTurk, a feasibility study of crowdsourced speaker identification based on our worker dataset collected in Amazon Mechanical Turk. Crowdsourced data labeling has already been acknowledged in speech data processing nowadays, but empirical analysis that answer to common questions such as “how accurate are workers capable of labeling speech data?” and “what does a good speech-labeling microtask interface look like?” still remain underexplored, which would limit the quality and scale of the dataset collection. Focusing on the speaker identification task in particular, we thus conducted two studies in Amazon Mechanical Turk: i) hired 3,800+ unique workers to test their performances and confidences in giving answers to voice pair comparison tasks, and ii) additionally assigned more-difficult tasks of 1-vs-N voice set comparisons to 350+ top-scoring workers to test their accuracy-speed performances across patterns of N = {1, 3, 5}. The results revealed some positive findings that would motivate speech researchers toward crowdsourced data labeling, such as that the top-scoring workers were capable of giving labels to our voice comparison pairs with 99% accuracy after majority voting, as well as they were even capable of batch-labeling which significantly shortened up to 34% of their completion time but still with no statistically-significant degradation in accuracy.

本文言語English
ホスト出版物のタイトル22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
出版社International Speech Communication Association
ページ2932-2936
ページ数5
ISBN(電子版)9781713836902
DOI
出版ステータスPublished - 2021
イベント22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 - Brno, Czech Republic
継続期間: 2021 8月 302021 9月 3

出版物シリーズ

名前Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
4
ISSN(印刷版)2308-457X
ISSN(電子版)1990-9772

Conference

Conference22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
国/地域Czech Republic
CityBrno
Period21/8/3021/9/3

ASJC Scopus subject areas

  • 言語および言語学
  • 人間とコンピュータの相互作用
  • 信号処理
  • ソフトウェア
  • モデリングとシミュレーション

フィンガープリント

「VocalTurk: Exploring feasibility of crowdsourced speaker identification」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル