Discriminative method for recurrent neural network language models

Yuuki Tachioka, Shinji Watanabe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

A recurrent neural network language model (RNN-LM) can use a longer word context than an n-gram language model, and its effectiveness has recently been shown in automatic speech recognition (ASR) tasks. However, the training criterion of an RNN-LM is based on cross entropy (CE) between predicted and reference words, and, unlike the discriminative training of acoustic models and discriminative language models (DLMs), it does not explicitly consider a discriminative criterion computed from ASR hypotheses and references. This paper proposes a discriminative training method for RNN-LMs that adds a discriminative criterion to CE. We use the log-likelihood ratio of the ASR hypotheses and references as the discriminative criterion. The proposed training criterion emphasizes the effect of misrecognized words relative to that of correctly recognized words, which are discounted during training. Experiments on a large vocabulary continuous speech recognition task show that the proposed method improves on the RNN-LM baseline, and that combining the proposed discriminative RNN-LM with a DLM yields further improvements.
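The record does not reproduce the paper's equations, but the abstract implies a per-word objective of the form CE plus a weighted log-likelihood ratio between aligned ASR-hypothesis and reference words. The following is a minimal PyTorch sketch under that reading; the interpolation weight lam, the one-to-one hypothesis/reference alignment, and all names are illustrative assumptions rather than the authors' exact formulation.

    import torch
    import torch.nn.functional as F

    def discriminative_lm_loss(logits, ref_ids, hyp_ids, lam=0.1):
        # logits: (T, V) RNN-LM scores for one utterance over a V-word vocabulary
        # ref_ids: (T,) reference word ids; hyp_ids: (T,) aligned ASR-hypothesis word ids
        # lam: assumed interpolation weight between CE and the discriminative term
        log_probs = F.log_softmax(logits, dim=-1)
        ref_lp = log_probs.gather(1, ref_ids.unsqueeze(1)).squeeze(1)  # log P(reference word)
        hyp_lp = log_probs.gather(1, hyp_ids.unsqueeze(1)).squeeze(1)  # log P(hypothesis word)
        ce = -ref_lp                          # standard cross-entropy term
        llr = hyp_lp - ref_lp                 # log-likelihood ratio of hypothesis vs. reference
        # Where hyp == ref the ratio vanishes, so correctly recognized words are
        # discounted and recognition errors dominate the gradient, matching the abstract.
        return (ce + lam * llr).mean()

    # Toy usage: one utterance of 6 words with a single recognition error.
    T, V = 6, 1000
    logits = torch.randn(T, V, requires_grad=True)
    ref = torch.randint(V, (T,))
    hyp = ref.clone()
    hyp[2] = (ref[2] + 1) % V                 # simulate one misrecognized word
    loss = discriminative_lm_loss(logits, ref, hyp)
    loss.backward()                           # gradients flow into the (here random) logits

Minimizing the extra term pushes down the probability of misrecognized hypothesis words while pushing up the corresponding reference words, which is one straightforward way to realize the emphasis/discount behavior the abstract describes.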

Original language: English
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5386-5390
Number of pages: 5
Volume: 2015-August
ISBN (Electronic): 9781467369978
DOI: 10.1109/ICASSP.2015.7179000
Publication status: Published - 2015 Aug 4
Externally published: Yes
Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: 2015 Apr 19 - 2015 Apr 24

Other

Other: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country: Australia
City: Brisbane
Period: 15/4/19 - 15/4/24

Fingerprint

  • Recurrent neural networks
  • Speech recognition
  • Entropy
  • Continuous speech recognition
  • Acoustics

Keywords

  • discriminative criterion
  • language model
  • log-likelihood ratio
  • recurrent neural network
  • speech recognition

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Tachioka, Y., & Watanabe, S. (2015). Discriminative method for recurrent neural network language models. In 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings (Vol. 2015-August, pp. 5386-5390). [7179000] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2015.7179000

