Structured discriminative models for speech recognition: An overview

Mark John Francis Gales*, Shinji Watanabe, Eric Fosler-Lussier

*Corresponding author for this work

Research output: Review article (peer-reviewed)

23 Citations (Scopus)

Abstract

Automatic speech recognition (ASR) systems classify structured sequence data, where the label sequences (sentences) must be inferred from the observation sequences (the acoustic waveform). The sequential nature of the task is one of the reasons why generative classifiers, which combine hidden Markov model (HMM) acoustic models and N-gram language models using Bayes' rule, have become the dominant technology in ASR. By contrast, machine learning and natural language processing (NLP) research is increasingly dominated by discriminative approaches, in which the class posteriors are modeled directly. This article describes recent work on structured discriminative models for ASR. To handle continuous, variable-length observation sequences, the approaches applied to NLP tasks must be modified. The article discusses a variety of approaches for applying structured discriminative models to ASR, drawn both from the current literature and from possible future directions. We concentrate on the structured models themselves, the descriptive features of observations commonly used within the models, and various options for optimizing the model parameters.

Original language: English
Article number: 6296527
Pages (from-to): 70-81
Number of pages: 12
Journal: IEEE Signal Processing Magazine
Volume: 29
Issue number: 6
DOI
Publication status: Published - 2012
Externally published: Yes

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Applied Mathematics
