Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data

Shinji Watanabe*, Takaaki Hori, Atsushi Nakamura

*Corresponding author for this work

Research output: Conference contribution

12 Citations (Scopus)

Abstract

This paper describes a discriminative approach that further advances the framework for Weighted Finite State Transducer (WFST) based decoding. The approach introduces additional linear models for adjusting the scores of a decoding graph composed of conventional information source models (e.g., hidden Markov models and N-gram models), and reviews the WFST-based decoding process as a linear classifier for structured data (e.g., sequential multiclass data). The difficulty with the approach is that the number of dimensions of the additional linear models becomes very large in proportion to the number of arcs in a WFST, and our previous study only applied it to a small task (TIMIT phoneme recognition). This paper proposes a training method for a large-scale linear classifier employed in WFST-based decoding by using a distributed perceptron algorithm. The experimental results show that the proposed approach was successfully applied to a large vocabulary continuous speech recognition task, and achieved an improvement compared with the performance of the minimum phone error based discriminative training of acoustic models.
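
As a rough illustration of the idea in the abstract, the following Python sketch treats decoding over a tiny graph as a linear classifier with one additional weight per arc, and trains those weights with a perceptron whose data shards are combined by parameter averaging. Everything in it is an assumption made for illustration: the toy graph `ARCS`, the observation scores, the unit update step, and the averaging scheme (iterative parameter mixing) are not taken from the paper, whose actual features and distributed algorithm may differ.

```python
# Hypothetical toy decoding graph: (arc_id, src, dst, label, base_score).
# States are numbered in topological order and arcs are listed by source state.
# base_score stands in for the conventional model scores (e.g. HMM, N-gram).
ARCS = [
    (0, 0, 1, "a", 0.0),
    (1, 0, 1, "b", 2.5),
    (2, 1, 2, "c", 0.0),
    (3, 1, 2, "d", 2.5),
]
START, FINAL = 0, 2


def decode(obs, w):
    """Return the arc ids of the best path (Viterbi over the acyclic graph).

    Each arc scores base_score + obs[label] + w[arc_id]; w holds the
    additional per-arc linear-model weights.
    """
    best = {START: (0.0, [])}
    for arc_id, src, dst, label, base in ARCS:
        if src not in best:
            continue
        score = best[src][0] + base + obs.get(label, 0.0) + w[arc_id]
        if dst not in best or score > best[dst][0]:
            best[dst] = (score, best[src][1] + [arc_id])
    return best[FINAL][1]


def perceptron_epoch(shard, w):
    """One structured-perceptron pass over a shard, starting from weights w."""
    w = list(w)  # work on a local copy, as a separate worker would
    for obs, ref in shard:
        hyp = decode(obs, w)
        if hyp != ref:
            for a in ref:
                w[a] += 1.0  # reward arcs on the reference path
            for a in hyp:
                w[a] -= 1.0  # penalize arcs on the wrongly decoded path
    return w


def distributed_perceptron(shards, epochs=5):
    """Train per shard, then average the shard weights after every epoch."""
    w = [0.0] * len(ARCS)
    for _ in range(epochs):
        # Computed sequentially here; each shard could run on its own worker.
        local = [perceptron_epoch(shard, w) for shard in shards]
        w = [sum(col) / len(local) for col in zip(*local)]
    return w


# Hypothetical training data: (observation scores per label, reference path).
SHARDS = [
    [({"a": 2.0, "c": 2.0}, [0, 2])],
    [({"b": 2.0, "d": 2.0}, [1, 3])],
]

if __name__ == "__main__":
    weights = distributed_perceptron(SHARDS)
    for shard in SHARDS:
        for obs, ref in shard:
            print(ref, decode(obs, weights))
```

In this toy setup the base scores alone mis-decode the first example, and the learned per-arc weights correct it without changing the result for the second. The weight vector has one entry per arc, which hints at why the dimensionality grows with the size of the WFST and why the training needs to be distributed for a large vocabulary task.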

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher: International Speech Communication Association
Pages: 346-349
Number of pages: 4
Publication status: Published - 2010
Externally published: Yes

Publication series

Name: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
