Espresso: A Fast End-To-End Neural Speech Recognition Toolkit

Yiming Wang, Sanjeev Khudanpur, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe

Research output: Conference contribution

1 Citation (Scopus)

Abstract

We present Espresso, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit FAIRSEQ. Espresso supports distributed training across GPUs and computing nodes, and features various decoding approaches commonly employed in ASR, including look-ahead word-based language model fusion, for which a fast, parallelized decoder is implemented. Espresso achieves state-of-the-art ASR performance on the WSJ, LibriSpeech, and Switchboard data sets among other end-to-end systems without data augmentation, and is 4-11x faster for decoding than similar systems (e.g. ESPNET).

Original language: English
Title of host publication: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 136-143
Number of pages: 8
ISBN (Electronic): 9781728103068
Publication status: Published - Dec 2019
Externally published: Yes
Event: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Singapore, Singapore
Duration: 2019-12-15 to 2019-12-18

Publication series

Name: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings

Conference

Conference: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019
Country: Singapore
City: Singapore
Period: 2019-12-15 to 2019-12-18

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Signal Processing
  • Linguistics and Language
  • Communication

