Multilingual End-to-End Speech Translation

Hirofumi Inaguma, Kevin Duh, Tatsuya Kawahara, Shinji Watanabe

Research output: Conference contribution

27 Citations (Scopus)

Abstract

In this paper, we propose a simple yet effective framework for multilingual end-to-end speech translation (ST), in which speech utterances in source languages are directly translated to the desired target languages with a universal sequence-to-sequence architecture. While multilingual models have been shown to be useful for automatic speech recognition (ASR) and machine translation (MT), this is the first time they have been applied to the end-to-end ST problem. We show the effectiveness of multilingual end-to-end ST in two scenarios: one-to-many and many-to-many translations with publicly available data. We experimentally confirm that multilingual end-to-end ST models significantly outperform bilingual ones in both scenarios. The generalization of multilingual training is also evaluated in a transfer learning scenario to a very low-resource language pair. All of our code and the database are publicly available to encourage further research in this emergent multilingual ST topic (available at https://github.com/espnet/espnet).

Original language: English
Title of host publication: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 570-577
Number of pages: 8
ISBN (Electronic): 9781728103068
DOI
Publication status: Published - Dec 2019
Externally published: Yes
Event: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Singapore, Singapore
Duration: 15 Dec 2019 to 18 Dec 2019

Publication series

Name: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings

Conference

Conference: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019
Country/Territory: Singapore
City: Singapore
Period: 19/12/15 to 19/12/18

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Signal Processing
  • Linguistics and Language
  • Communication
