Automatic determination of acoustic model topology using variational Bayesian estimation and clustering

Shinji Watanabe, Atsushi Sako, Atsushi Nakamura

Research output: Contribution to journal (Article)

5 Citations (Scopus)

Abstract

We describe the automatic determination of an acoustic model for speech recognition, which is very complicated and includes latent variables, using VBEC: Variational Bayesian Estimation and Clustering for speech recognition. We propose an efficient Gaussian Mixture Model (GMM) based phonetic decision tree construction within the VBEC framework. The proposed method features a novel approach to reduce the unrealistically large number of computations needed for iterative calculations in the GMM-based decision tree method to a practical level by assuming that each Gaussian per state has the same occupancy and is represented by the same posterior distribution for the covariance parameter. The experimental results confirmed that VBEC automatically provided an optimum model topology with the highest performance level.
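As a rough illustration of why a Bayesian objective can choose a decision-tree topology automatically, the sketch below scores a candidate split by the change in log evidence. It is a toy under stated assumptions, not the VBEC procedure from the abstract: each node holds a single one-dimensional Gaussian rather than a GMM, the exact conjugate (Normal-Gamma) marginal likelihood stands in for the variational objective, and the names `log_evidence`, `split_gain`, and the prior values are hypothetical.

```python
import math


def log_evidence(n, s1, s2, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Log marginal likelihood of n scalar observations modelled by one
    Gaussian with a Normal-Gamma prior (hyperparameters are assumptions).

    n, s1, s2 are the sufficient statistics: count, sum of x, sum of x^2.
    """
    if n == 0:
        return 0.0  # an empty node contributes no evidence
    mean = s1 / n
    scatter = max(s2 - n * mean * mean, 0.0)  # sum of squared deviations
    kappa_n = kappa0 + n
    alpha_n = alpha0 + 0.5 * n
    beta_n = (beta0 + 0.5 * scatter
              + kappa0 * n * (mean - mu0) ** 2 / (2.0 * kappa_n))
    return (math.lgamma(alpha_n) - math.lgamma(alpha0)
            + alpha0 * math.log(beta0) - alpha_n * math.log(beta_n)
            + 0.5 * (math.log(kappa0) - math.log(kappa_n))
            - 0.5 * n * math.log(2.0 * math.pi))


def stats(xs):
    """Sufficient statistics (count, sum, sum of squares) of a sample."""
    return len(xs), sum(xs), sum(x * x for x in xs)


def split_gain(yes, no):
    """Change in log evidence when a node is split by a yes/no question.

    A positive value favours the split; a negative value favours keeping
    the node whole, so tree growth stops without a hand-tuned threshold.
    """
    parent = list(yes) + list(no)
    return (log_evidence(*stats(yes)) + log_evidence(*stats(no))
            - log_evidence(*stats(parent)))
```

With these assumed priors, splitting two well-separated groups of samples yields a positive gain, while splitting a homogeneous group yields a negative one, because the evidence penalizes the extra parameters of the second Gaussian.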

Original language: English
Journal: Unknown Journal
Volume: 1
Publication status: Published - 2004
Externally published: Yes

Fingerprint

Topology
Acoustics
Speech recognition
Decision trees
Phonetics
Speech analysis

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Acoustics and Ultrasonics

Cite this

Automatic determination of acoustic model topology using variational Bayesian estimation and clustering. / Watanabe, Shinji; Sako, Atsushi; Nakamura, Atsushi.

In: Unknown Journal, Vol. 1, 2004.

@article{66fcfe73abe546f5bb717fe5c86560b9,
title = "Automatic determination of acoustic model topology using variational Bayesian estimation and clustering",
abstract = "We describe the automatic determination of an acoustic model for speech recognition, which is very complicated and includes latent variables, using VBEC: Variational Bayesian Estimation and Clustering for speech recognition. We propose an efficient Gaussian Mixture Model (GMM) based phonetic decision tree construction within the VBEC framework. The proposed method features a novel approach to reduce the unrealistically large number of computations needed for iterative calculations in the GMM-based decision tree method to a practical level by assuming that each Gaussian per state has the same occupancy and is represented by the same posterior distribution for the covariance parameter. The experimental results confirmed that VBEC automatically provided an optimum model topology with the highest performance level.",
author = "Shinji Watanabe and Atsushi Sako and Atsushi Nakamura",
year = "2004",
language = "English",
volume = "1",
journal = "Unknown Journal",
}

TY - JOUR

T1 - Automatic determination of acoustic model topology using variational Bayesian estimation and clustering

AU - Watanabe, Shinji

AU - Sako, Atsushi

AU - Nakamura, Atsushi

PY - 2004

Y1 - 2004

N2 - We describe the automatic determination of an acoustic model for speech recognition, which is very complicated and includes latent variables, using VBEC: Variational Bayesian Estimation and Clustering for speech recognition. We propose an efficient Gaussian Mixture Model (GMM) based phonetic decision tree construction within the VBEC framework. The proposed method features a novel approach to reduce the unrealistically large number of computations needed for iterative calculations in the GMM-based decision tree method to a practical level by assuming that each Gaussian per state has the same occupancy and is represented by the same posterior distribution for the covariance parameter. The experimental results confirmed that VBEC automatically provided an optimum model topology with the highest performance level.

AB - We describe the automatic determination of an acoustic model for speech recognition, which is very complicated and includes latent variables, using VBEC: Variational Bayesian Estimation and Clustering for speech recognition. We propose an efficient Gaussian Mixture Model (GMM) based phonetic decision tree construction within the VBEC framework. The proposed method features a novel approach to reduce the unrealistically large number of computations needed for iterative calculations in the GMM-based decision tree method to a practical level by assuming that each Gaussian per state has the same occupancy and is represented by the same posterior distribution for the covariance parameter. The experimental results confirmed that VBEC automatically provided an optimum model topology with the highest performance level.

UR - http://www.scopus.com/inward/record.url?scp=4544387676&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=4544387676&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:4544387676

VL - 1

JO - Unknown Journal

JF - Unknown Journal

ER -