Abstract
This paper describes minimum word error (MWE) training of recurrent neural network language models (RNNLMs) for speech recognition. RNNLMs are usually trained to minimize the cross entropy of the estimated word probabilities against the correct word sequence, which corresponds to the maximum likelihood criterion. However, this training does not necessarily optimize the performance measure of the target task; i.e., it does not explicitly minimize word error rate (WER) in speech recognition. To address this problem, several discriminative training methods have been proposed for n-gram language models, but such methods for RNNLMs have not been sufficiently investigated. In this paper, we propose an MWE training method for RNNLMs and report significant WER reductions when applying it to a standard Elman-type RNNLM and to a more advanced model, a Long Short-Term Memory (LSTM) RNNLM. We also present an efficient MWE training procedure that uses N-best lists on Graphics Processing Units (GPUs).
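The paper itself defines the exact objective; as a rough illustration only, below is a minimal NumPy sketch of an MWE-style loss over a single N-best list: hypothesis posteriors are formed from combined acoustic and LM scores, and the expected number of word errors is minimized. The function name, the `lm_weight` interpolation, and the flat (non-lattice) N-best handling are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def mwe_loss_and_grad(am_logscores, lm_logscores, word_errors, lm_weight=1.0):
    """Hypothetical sketch of an MWE-style objective over one N-best list.

    am_logscores : acoustic-model log-scores, shape (N,)
    lm_logscores : RNNLM log-probabilities,   shape (N,)
    word_errors  : edit-distance word errors of each hypothesis
                   against the reference transcript, shape (N,)
    Returns the expected number of word errors and its gradient
    with respect to the LM log-scores.
    """
    combined = am_logscores + lm_weight * lm_logscores
    combined -= np.logaddexp.reduce(combined)  # log-posteriors over the N-best
    post = np.exp(combined)                    # hypothesis posteriors
    expected_err = post @ word_errors          # MWE objective for this utterance
    # Softmax-style gradient: in a full system this would be backpropagated
    # through time into the RNNLM parameters.
    grad = lm_weight * post * (word_errors - expected_err)
    return expected_err, grad
```

In this form the gradient weights each hypothesis by how its error count deviates from the expectation, so hypotheses with more errors than average are pushed down and better-than-average ones are pushed up.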
Original language | English
---|---
Title of host publication | 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher | Institute of Electrical and Electronics Engineers Inc.
Pages | 5990-5994
Number of pages | 5
Volume | 2016-May
ISBN (Electronic) | 9781479999880
DOIs |
Publication status | Published - 2016 May 18
Externally published | Yes
Event | 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration | 2016 Mar 20 → 2016 Mar 25
Other
Other | 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 |
---|---
Country/Territory | China
City | Shanghai
Period | 2016 Mar 20 → 2016 Mar 25
Keywords
- Long short-term memory
- Minimum word error training
- Recurrent neural network language model
- Speech recognition
ASJC Scopus subject areas
- Signal Processing
- Software
- Electrical and Electronic Engineering