Back-Translation-Style Data Augmentation for end-to-end ASR

Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramon Astudillo, Kazuya Takeda

Research output

28 Citations (Scopus)

Abstract

In this paper, we propose a novel data augmentation method for attention-based end-to-end automatic speech recognition (E2E-ASR) that uses a large amount of text not paired with speech signals. Inspired by the back-translation technique proposed in the field of machine translation, we build a neural text-to-encoder model which predicts, from a sequence of characters, the sequence of hidden states that a pre-trained E2E-ASR encoder would extract. Using hidden states as the target instead of acoustic features makes it possible to achieve faster attention learning and to reduce computational cost, thanks to sub-sampling in the E2E-ASR encoder. In addition, unlike acoustic features, hidden states avoid the need to model speaker dependencies. After training, the text-to-encoder model generates hidden states from a large amount of unpaired text, and the E2E-ASR decoder is then retrained using the generated hidden states as additional training data. Experimental evaluation on the LibriSpeech dataset demonstrates that the proposed method improves ASR performance and reduces the number of unknown words without the need for paired data.
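The data flow described in the abstract can be sketched roughly as follows. This is only a toy illustration, not the authors' implementation: every name here (`fake_encoder_states`, `train_text_to_encoder`, `generate_states`, `build_augmented_set`, `HIDDEN_DIM`) is hypothetical, the "pre-trained encoder" is simulated from the transcript rather than run on speech, and the neural text-to-encoder model is replaced by a per-character average. It also emits one vector per character, so it does not reflect the encoder sub-sampling mentioned above.

```python
import random
from collections import defaultdict

HIDDEN_DIM = 4  # toy hidden-state size, chosen arbitrarily


def fake_encoder_states(text, rng):
    """Stand-in for a pre-trained E2E-ASR encoder (assumption: simulated
    from the transcript, one slightly noisy vector per character)."""
    states = []
    for ch in text:
        base = [((ord(ch) * (i + 1)) % 7) / 7.0 for i in range(HIDDEN_DIM)]
        states.append([b + rng.gauss(0.0, 0.01) for b in base])
    return states


def train_text_to_encoder(paired):
    """Toy text-to-encoder model: mean hidden state seen for each character."""
    sums = defaultdict(lambda: [0.0] * HIDDEN_DIM)
    counts = defaultdict(int)
    for text, states in paired:
        for ch, vec in zip(text, states):
            counts[ch] += 1
            sums[ch] = [s + v for s, v in zip(sums[ch], vec)]
    return {ch: [s / counts[ch] for s in sums[ch]] for ch in sums}


def generate_states(model, text):
    """Predict hidden states for unpaired text; unseen characters are skipped."""
    return [model[ch] for ch in text if ch in model]


def build_augmented_set(paired_texts, unpaired_texts, seed=0):
    rng = random.Random(seed)
    # Step 1: hidden states from the (simulated) pre-trained encoder on paired data.
    paired = [(t, fake_encoder_states(t, rng)) for t in paired_texts]
    # Step 2: fit the text-to-encoder model on (characters, hidden states).
    model = train_text_to_encoder(paired)
    # Step 3: generate pseudo hidden states for text that has no speech.
    generated = [(t, generate_states(model, t)) for t in unpaired_texts]
    # Step 4: a decoder would be retrained on the real plus generated pairs.
    return paired + generated


augmented = build_augmented_set(["hello", "world"], ["hold", "lower"])
```

In this sketch, `augmented` holds the two real (text, hidden states) pairs followed by two generated ones; in a real system the last step would feed these pairs to the decoder's training loop.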

Original language: English
Title of host publication: 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 426-433
Number of pages: 8
ISBN (Electronic): 9781538643341
DOI
Publication status: Published - 11 Feb 2019
Externally published: Yes
Event: 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Athens, Greece
Duration: 18 Dec 2018 to 21 Dec 2018

Publication series

Name: 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings

Conference

Conference: 2018 IEEE Spoken Language Technology Workshop, SLT 2018
Country/Territory: Greece
City: Athens
Period: 18/12/18 to 18/12/21

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Linguistics and Language
