Discriminative feature transforms using differenced maximum mutual information

Marc Delcroix, Atsunori Ogawa, Shinji Watanabe, Tomohiro Nakatani, Atsushi Nakamura

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Recently, feature compensation techniques that train feature transforms using a discriminative criterion have attracted much interest in the speech recognition community. Typically, the acoustic feature space is modeled by a Gaussian mixture model (GMM), and a feature transform is assigned to each Gaussian of the GMM. Feature compensation is then performed by transforming features using the transformation associated with each Gaussian, and then summing the transformed features weighted by the posterior probability of each Gaussian. Several discriminative criteria have been investigated for estimating the feature transformation parameters, including maximum mutual information (MMI) and minimum phone error (MPE). Recently, the differenced MMI (dMMI) criterion, which generalizes MMI and MPE, has been shown to provide competitive performance for acoustic model training. In this paper, we investigate the use of the dMMI criterion for discriminative feature transforms and demonstrate in a noisy speech recognition experiment that dMMI achieves recognition performance superior to that of MMI or MPE.
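
To make the compensation scheme described in the abstract concrete, the following is a minimal, illustrative sketch in Python/NumPy, assuming affine per-Gaussian transforms (A_k, b_k); the names, shapes, and the affine form of the transforms are assumptions made for illustration, not the authors' implementation.

import numpy as np
from scipy.stats import multivariate_normal

# Illustrative sketch of GMM-posterior-weighted feature compensation.
# A GMM models the acoustic feature space; each Gaussian k has an associated
# affine transform (A_k, b_k). A frame x is compensated by summing the
# per-Gaussian transformed features, weighted by the Gaussian posteriors.

def gmm_posteriors(x, weights, means, covs):
    """Posterior probability of each Gaussian for a single feature frame x."""
    likes = np.array([
        w * multivariate_normal.pdf(x, mean=m, cov=c)
        for w, m, c in zip(weights, means, covs)
    ])
    return likes / likes.sum()

def compensate(x, weights, means, covs, A, b):
    """Return y = sum_k gamma_k(x) * (A_k @ x + b_k)."""
    gamma = gmm_posteriors(x, weights, means, covs)   # (K,)
    transformed = np.einsum('kij,j->ki', A, x) + b    # (K, D)
    return gamma @ transformed                        # (D,)

# Toy usage: K = 2 Gaussians over D = 3 dimensional features.
rng = np.random.default_rng(0)
D, K = 3, 2
weights = np.array([0.5, 0.5])
means = rng.normal(size=(K, D))
covs = np.stack([np.eye(D)] * K)
A = np.stack([np.eye(D)] * K)            # per-Gaussian linear parts (assumed affine form)
b = rng.normal(scale=0.1, size=(K, D))   # per-Gaussian bias parts
x = rng.normal(size=D)
print(compensate(x, weights, means, covs, A, b))

The dMMI criterion referred to in the abstract can be viewed as a finite difference of a boosted-MMI-style functional. A schematic form (notation ours; a sketch rather than the paper's exact definition) is

\[
\mathcal{F}_{\mathrm{dMMI}}^{\sigma_1,\sigma_2}(\Lambda)
= \frac{1}{\sigma_2-\sigma_1}\sum_r \log
\frac{\sum_s p_\Lambda(X_r \mid s)^{\kappa}\, P(s)\, e^{\sigma_1 E_r(s)}}
     {\sum_s p_\Lambda(X_r \mid s)^{\kappa}\, P(s)\, e^{\sigma_2 E_r(s)}},
\qquad \sigma_1 < \sigma_2,
\]

where X_r is the r-th training utterance, s ranges over competing hypotheses, E_r(s) is an error measure (e.g., phone error) against the reference, κ is an acoustic scale, and Λ here would be the feature transform parameters. Letting σ1 → −∞ concentrates the numerator on the zero-error reference and recovers a boosted-MMI-style objective with boosting factor σ2, while letting σ1 → 0− and σ2 → 0+ turns the ratio into a derivative at zero and yields an MPE-style expected-error objective; this is the sense in which dMMI generalizes MMI and MPE. The exact definition and parameter settings should be taken from the paper itself.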

Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 4753-4756
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.2012.6288981
ISBN (Print): 9781467300469
Publication status: Published - 2012
Externally published: Yes
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto
Duration: 2012 Mar 25 - 2012 Mar 30

Keywords

  • differenced MMI
  • discriminative feature transforms
  • discriminative training
  • Speech recognition

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Delcroix, M., Ogawa, A., Watanabe, S., Nakatani, T., & Nakamura, A. (2012). Discriminative feature transforms using differenced maximum mutual information. In 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings (pp. 4753-4756). [6288981] https://doi.org/10.1109/ICASSP.2012.6288981
