Decoding network optimization using minimum transition error training

Yotaro Kubo*, Shinji Watanabe, Atsushi Nakamura

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

5 Citations (Scopus)

Abstract

The discriminative optimization of decoding networks is important for minimizing speech recognition error. Recently, several methods have been reported that optimize decoding networks by extending weighted finite state transducer (WFST)-based decoding processes to a linear classification process. In this paper, we model decoding processes by using conditional random fields (CRFs). Since the maximum mutual information (MMI) training technique is straightforwardly applicable for CRF training, several sophisticated training methods proposed as the variants of MMI can be incorporated in our decoding network optimization. This paper adapts the boosted MMI and the differenced MMI methods for decoding network optimization so that state transition errors are minimized in WFST decoding. We evaluated the proposed methods by conducting large-vocabulary continuous speech recognition experiments. We confirmed that the CRF-based framework and transition error minimization are efficient for improving the accuracy of automatic speech recognizers.

Original languageEnglish
Title of host publication2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages4197-4200
Number of pages4
DOIs
Publication statusPublished - 2012
Externally publishedYes
Event2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: 2012 Mar 252012 Mar 30

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)1520-6149

Conference

Conference2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Country/TerritoryJapan
CityKyoto
Period12/3/2512/3/30

Keywords

  • Automatic speech recognition
  • conditional random fields
  • transition errors
  • weighed finite-state transducers

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Decoding network optimization using minimum transition error training'. Together they form a unique fingerprint.

Cite this