CMU’s IWSLT 2022 Dialect Speech Translation System

Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe

Research output: Conference contribution

2 Citations (Scopus)

Abstract

This paper describes CMU’s submissions to the IWSLT 2022 dialect speech translation (ST) shared task for translating Tunisian-Arabic speech to English text. We use additional paired Modern Standard Arabic data (MSA) to directly improve the speech recognition (ASR) and machine translation (MT) components of our cascaded systems. We also augment the paired ASR data with pseudo translations via sequence-level knowledge distillation from an MT model and use these artificial triplet ST data to improve our end-to-end (E2E) systems. Our E2E models are based on the Multi-Decoder architecture with searchable hidden intermediates. We extend the Multi-Decoder by orienting the speech encoder towards the target language by applying ST supervision as hierarchical connectionist temporal classification (CTC) multi-task. During inference, we apply joint decoding of the ST CTC and ST autoregressive decoder branches of our modified Multi-Decoder. Finally, we apply ROVER voting, posterior combination, and minimum Bayes-risk decoding with combined N-best lists to ensemble our various cascaded and E2E systems. Our best systems reached 20.8 and 19.5 BLEU on test2 (blind) and test1 respectively. Without any additional MSA data, we reached 20.4 and 19.2 on the same test sets.
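
As an illustration of the last ensembling step named in the abstract, the following is a minimal sketch of minimum Bayes-risk selection over a combined N-best list. It is not the authors' implementation. The function names `token_f1` and `mbr_select`, the input format, the uniform weighting of hypotheses, and the use of unigram F1 in place of a BLEU-based utility are all assumptions made for the example.

```python
from collections import Counter


def token_f1(hyp: str, ref: str) -> float:
    """Unigram F1 overlap; an assumed stand-in for a sentence-level BLEU utility."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_select(nbest_lists, utility=token_f1):
    """Return the hypothesis with the highest average utility against all others.

    nbest_lists: one list of hypothesis strings per system (hypothetical format).
    Hypotheses are weighted uniformly (an assumption); a real system might
    weight them by model score instead.
    """
    # Combine the N-best lists of all systems into a single candidate pool.
    pool = [hyp for nbest in nbest_lists for hyp in nbest]
    if not pool:
        raise ValueError("no hypotheses to choose from")

    def expected_utility(index):
        candidate = pool[index]
        others = [h for j, h in enumerate(pool) if j != index]
        if not others:
            return 0.0
        return sum(utility(candidate, other) for other in others) / len(others)

    # Ties are broken in favor of the earliest candidate in the pool.
    best_index = max(range(len(pool)), key=expected_utility)
    return pool[best_index]
```

Given N-best lists from two systems, `mbr_select` returns the candidate that agrees most with the rest of the pool, so a translation proposed by several systems tends to win over an outlier.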

Original language: English
Title of host publication: IWSLT 2022 - 19th International Conference on Spoken Language Translation, Proceedings of the Conference
Editors: Elizabeth Salesky, Marcello Federico, Marta Costa-Jussa
Publisher: Association for Computational Linguistics (ACL)
Pages: 298-307
Number of pages: 10
ISBN (Electronic): 9781955917414
Publication status: Published - 2022
Externally published: Yes
Event: 19th International Conference on Spoken Language Translation, IWSLT 2022 - Dublin, Ireland
Duration: 26 May 2022 to 27 May 2022

Publication series

Name: IWSLT 2022 - 19th International Conference on Spoken Language Translation, Proceedings of the Conference

Conference

Conference: 19th International Conference on Spoken Language Translation, IWSLT 2022
Country/Territory: Ireland
City: Dublin
Period: 26 May 2022 to 27 May 2022

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Linguistics and Language
