Sequential maximum mutual information linear discriminant analysis for speech recognition

Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Linear discriminant analysis (LDA) is a simple and effective feature transformation technique that aims to improve discriminability by maximizing the ratio of the between-class variance to the within-class variance. However, LDA does not explicitly consider a sequential discriminative criterion that directly reduces the errors of a speech recognizer. This paper proposes a simple extension of LDA, called sequential LDA (sLDA), based on a sequential discriminative criterion computed from Gaussian statistics obtained from sequential maximum mutual information (MMI) training. Although the objective function of the proposed LDA can be regarded as a special case of various discriminative feature transformation techniques (for example, f-MPE or the bottom layer of a neural network), the transformation matrix can be obtained as the closed-form solution to a generalized eigenvalue problem, in contrast to the gradient-descent-based optimization usually used in those techniques. Experiments on an LVCSR task (Corpus of Spontaneous Japanese) and a noisy speech recognition task (2nd CHiME challenge) show consistent improvements over standard LDA thanks to the sequential discriminative training. In addition, despite its simple and fast computation, the proposed method improves performance in combination with discriminative feature transformation (f-bMMI), perhaps by providing a good initialization for f-bMMI.
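The closed-form solution mentioned in the abstract follows from posing LDA as a generalized eigenvalue problem between the between-class and within-class scatter matrices. Below is a minimal, self-contained sketch of this computation on toy data (all data, names, and dimensions here are illustrative, not from the paper); the paper's sLDA differs in that the scatter statistics are replaced by Gaussian statistics accumulated during sequential MMI training, but the eigenvalue machinery is the same.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
# Toy features: 3 classes, 6-dimensional features, 100 frames per class.
X = np.vstack([rng.normal(loc=c, size=(100, 6)) for c in range(3)])
y = np.repeat(np.arange(3), 100)

mean_all = X.mean(axis=0)
d = X.shape[1]
S_w = np.zeros((d, d))  # within-class scatter
S_b = np.zeros((d, d))  # between-class scatter
for c in np.unique(y):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    S_w += (Xc - mc).T @ (Xc - mc)
    diff = (mc - mean_all)[:, None]
    S_b += len(Xc) * (diff @ diff.T)

# Solve the generalized eigenvalue problem S_b v = lambda S_w v.
# Eigenvectors with the largest eigenvalues maximize the ratio of
# between-class to within-class variance in the projected space.
eigvals, eigvecs = eigh(S_b, S_w)
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order[:2]]  # keep 2 discriminant dimensions (rank of S_b is 2 for 3 classes)
Z = X @ W                  # transformed features
print(Z.shape)  # → (300, 2)
```

Because `scipy.linalg.eigh` solves the generalized problem directly, no gradient descent or learning-rate tuning is needed, which is the computational advantage the abstract highlights over f-MPE-style transforms.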

Original language: English
Pages (from-to): 2415-2419
Number of pages: 5
Journal: Unknown Journal
Publication status: Published - 2014
Externally published: Yes

Keywords

  • Linear discriminant analysis
  • Maximum mutual information
  • Region dependent linear transformation

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

Sequential maximum mutual information linear discriminant analysis for speech recognition. / Tachioka, Yuuki; Watanabe, Shinji; Le Roux, Jonathan; Hershey, John R.

In: Unknown Journal, 2014, pp. 2415-2419.

Research output: Contribution to journal › Article

Tachioka, Yuuki ; Watanabe, Shinji ; Le Roux, Jonathan ; Hershey, John R. / Sequential maximum mutual information linear discriminant analysis for speech recognition. In: Unknown Journal. 2014 ; pp. 2415-2419.
@article{14a55c70192b48ccb4049feeaaab1e5a,
title = "Sequential maximum mutual information linear discriminant analysis for speech recognition",
abstract = "Linear discriminant analysis (LDA) is a simple and effective feature transformation technique that aims to improve discriminability by maximizing the ratio of the between-class variance to the within-class variance. However, LDA does not explicitly consider a sequential discriminative criterion that directly reduces the errors of a speech recognizer. This paper proposes a simple extension of LDA, called sequential LDA (sLDA), based on a sequential discriminative criterion computed from Gaussian statistics obtained from sequential maximum mutual information (MMI) training. Although the objective function of the proposed LDA can be regarded as a special case of various discriminative feature transformation techniques (for example, f-MPE or the bottom layer of a neural network), the transformation matrix can be obtained as the closed-form solution to a generalized eigenvalue problem, in contrast to the gradient-descent-based optimization usually used in those techniques. Experiments on an LVCSR task (Corpus of Spontaneous Japanese) and a noisy speech recognition task (2nd CHiME challenge) show consistent improvements over standard LDA thanks to the sequential discriminative training. In addition, despite its simple and fast computation, the proposed method improves performance in combination with discriminative feature transformation (f-bMMI), perhaps by providing a good initialization for f-bMMI.",
keywords = "Linear discriminant analysis, Maximum mutual information, Region dependent linear transformation",
author = "Tachioka, Yuuki and Watanabe, Shinji and {Le Roux}, Jonathan and Hershey, {John R.}",
year = "2014",
language = "English",
pages = "2415--2419",
journal = "Unknown Journal",

}

TY - JOUR

T1 - Sequential maximum mutual information linear discriminant analysis for speech recognition

AU - Tachioka, Yuuki

AU - Watanabe, Shinji

AU - Le Roux, Jonathan

AU - Hershey, John R.

PY - 2014

Y1 - 2014

N2 - Linear discriminant analysis (LDA) is a simple and effective feature transformation technique that aims to improve discriminability by maximizing the ratio of the between-class variance to the within-class variance. However, LDA does not explicitly consider a sequential discriminative criterion that directly reduces the errors of a speech recognizer. This paper proposes a simple extension of LDA, called sequential LDA (sLDA), based on a sequential discriminative criterion computed from Gaussian statistics obtained from sequential maximum mutual information (MMI) training. Although the objective function of the proposed LDA can be regarded as a special case of various discriminative feature transformation techniques (for example, f-MPE or the bottom layer of a neural network), the transformation matrix can be obtained as the closed-form solution to a generalized eigenvalue problem, in contrast to the gradient-descent-based optimization usually used in those techniques. Experiments on an LVCSR task (Corpus of Spontaneous Japanese) and a noisy speech recognition task (2nd CHiME challenge) show consistent improvements over standard LDA thanks to the sequential discriminative training. In addition, despite its simple and fast computation, the proposed method improves performance in combination with discriminative feature transformation (f-bMMI), perhaps by providing a good initialization for f-bMMI.

AB - Linear discriminant analysis (LDA) is a simple and effective feature transformation technique that aims to improve discriminability by maximizing the ratio of the between-class variance to the within-class variance. However, LDA does not explicitly consider a sequential discriminative criterion that directly reduces the errors of a speech recognizer. This paper proposes a simple extension of LDA, called sequential LDA (sLDA), based on a sequential discriminative criterion computed from Gaussian statistics obtained from sequential maximum mutual information (MMI) training. Although the objective function of the proposed LDA can be regarded as a special case of various discriminative feature transformation techniques (for example, f-MPE or the bottom layer of a neural network), the transformation matrix can be obtained as the closed-form solution to a generalized eigenvalue problem, in contrast to the gradient-descent-based optimization usually used in those techniques. Experiments on an LVCSR task (Corpus of Spontaneous Japanese) and a noisy speech recognition task (2nd CHiME challenge) show consistent improvements over standard LDA thanks to the sequential discriminative training. In addition, despite its simple and fast computation, the proposed method improves performance in combination with discriminative feature transformation (f-bMMI), perhaps by providing a good initialization for f-bMMI.

KW - Linear discriminant analysis

KW - Maximum mutual information

KW - Region dependent linear transformation

UR - http://www.scopus.com/inward/record.url?scp=84910089514&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84910089514&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84910089514

SP - 2415

EP - 2419

JO - Unknown Journal

JF - Unknown Journal

ER -