SUPERB: Speech processing Universal PERformance Benchmark

Shu Wen Yang, Po Han Chi*, Yung Sung Chuang, Cheng I.Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan Ting Lin, Tzu Hsien Huang, Wei Cheng Tseng, Ko Tik Lee, Da Rong Liu, Zili Huang, Shuyan Dong, Shang Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung Yi Lee

*Corresponding author for this work

Research output: Conference contribution

2 Citations (Scopus)

Abstract

Self-supervised learning (SSL) has proven vital for advancing research in natural language processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on large volumes of unlabeled data and achieves state-of-the-art (SOTA) for various tasks with minimal adaptation. However, the speech processing community lacks a similar setup to systematically explore the paradigm. To bridge this gap, we introduce Speech processing Universal PERformance Benchmark (SUPERB). SUPERB is a leaderboard to benchmark the performance of a shared model across a wide range of speech processing tasks with minimal architecture changes and labeled data. Among multiple usages of the shared model, we especially focus on extracting the representation learned from SSL for its preferable re-usability. We present a simple framework to solve SUPERB tasks by learning task-specialized lightweight prediction heads on top of the frozen shared model. Our results demonstrate that the framework is promising as SSL representations show competitive generalizability and accessibility across SUPERB tasks. We release SUPERB as a challenge with a leaderboard and a benchmark toolkit to fuel the research in representation learning and general speech processing.
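
The framework named in the abstract (a frozen shared model with a lightweight, task-specific prediction head trained on top) can be illustrated with a toy sketch. This is not the SUPERB toolkit or its API: the "upstream" below is a fixed random projection standing in for a pretrained SSL model, the task is an invented binary classification problem, and every name (`upstream`, `LinearHead`, `FEATURE_DIM`, and so on) is hypothetical. It only shows the training pattern, in which the head's parameters are updated while the upstream weights are never touched.

```python
import math
import random

random.seed(0)

INPUT_DIM = 4     # size of a toy input vector (hypothetical)
FEATURE_DIM = 8   # size of the frozen representation (hypothetical)

# Stand-in for a pretrained SSL upstream: a fixed random projection + tanh.
# The weights are created once and never updated, i.e. the model is frozen.
UPSTREAM_W = [[random.uniform(-1, 1) for _ in range(INPUT_DIM)]
              for _ in range(FEATURE_DIM)]


def upstream(x):
    """Map a toy input vector to its frozen representation."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in UPSTREAM_W]


class LinearHead:
    """Lightweight task-specific head: logistic regression on frozen features."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def predict_proba(self, feats):
        z = sum(w * f for w, f in zip(self.w, feats)) + self.b
        # Numerically stable sigmoid.
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        e = math.exp(z)
        return e / (1.0 + e)

    def sgd_step(self, feats, label, lr=0.5):
        # Gradient of binary cross-entropy with respect to the head only.
        err = self.predict_proba(feats) - label
        self.w = [w - lr * err * f for w, f in zip(self.w, feats)]
        self.b -= lr * err


def make_toy_data(n):
    """Invented task: the label is 1 when the first input coordinate is positive."""
    data = []
    for _ in range(n):
        x = [random.uniform(-1, 1) for _ in range(INPUT_DIM)]
        data.append((x, 1 if x[0] > 0 else 0))
    return data


def train_head(data, epochs=50):
    head = LinearHead(FEATURE_DIM)
    for _ in range(epochs):
        for x, y in data:
            # The upstream is only read here; its weights are not modified.
            head.sgd_step(upstream(x), y)
    return head


def accuracy(head, data):
    hits = sum((head.predict_proba(upstream(x)) > 0.5) == bool(y)
               for x, y in data)
    return hits / len(data)


train, test = make_toy_data(200), make_toy_data(100)
frozen_snapshot = [row[:] for row in UPSTREAM_W]  # copy to verify freezing
head = train_head(train)
print("train accuracy:", accuracy(head, train))
print("test accuracy:", accuracy(head, test))
print("upstream unchanged:", UPSTREAM_W == frozen_snapshot)
```

Under this pattern, a second task would get its own `LinearHead` trained on the same `upstream` outputs, which is the re-usability the abstract refers to.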

Original language: English
Title of host publication: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Publisher: International Speech Communication Association
Pages: 3161-3165
Number of pages: 5
ISBN (Electronic): 9781713836902
DOI
Publication status: Published - 2021
Externally published: Yes
Event: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 - Brno, Czech Republic
Duration: 30 Aug 2021 to 3 Sep 2021

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
4
ISSN (Print): 2308-457X
ISSN (Electronic): 1990-9772

Conference

Conference: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Country/Territory: Czech Republic
City: Brno
Period: 30 Aug 2021 to 3 Sep 2021

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
