Automatic determination of acoustic model topology using variational Bayesian estimation and clustering for large vocabulary continuous speech recognition

Shinji Watanabe, Atsushi Sako, Atsushi Nakamura

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

We describe the automatic determination of a large and complicated acoustic model for speech recognition using variational Bayesian estimation and clustering (VBEC). We propose an efficient method for decision tree clustering based on a Gaussian mixture model (GMM) and an efficient model search algorithm for finding an appropriate acoustic model topology within the VBEC framework. The GMM-based decision tree clustering of triphone hidden Markov model (HMM) states features a novel approach that reduces the otherwise excessive number of computations to a practical level by utilizing the statistics of monophone HMM states. The model search algorithm also reduces the search space by utilizing the characteristics of the acoustic model. The experimental results confirmed that VBEC automatically and rapidly yielded an optimum model topology with the highest performance.

Original language: English
Pages (from-to): 855-872
Number of pages: 18
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 14
Issue number: 3
DOI: 10.1109/TSA.2005.857791
Publication status: Published - 2006 May
Externally published: Yes

Fingerprint

Continuous speech recognition
Speech recognition
Topology
Acoustics
Decision trees
Hidden Markov models
Statistics

Keywords

  • Determination of acoustic model topologies
  • Speech recognition
  • Variational Bayes
  • Variational Bayesian estimation and clustering (VBEC)

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

@article{3eae0b9e787c4d2da66603bf9441cbde,
title = "Automatic determination of acoustic model topology using variational {B}ayesian estimation and clustering for large vocabulary continuous speech recognition",
abstract = "We describe the automatic determination of a large and complicated acoustic model for speech recognition using variational Bayesian estimation and clustering (VBEC). We propose an efficient method for decision tree clustering based on a Gaussian mixture model (GMM) and an efficient model search algorithm for finding an appropriate acoustic model topology within the VBEC framework. The GMM-based decision tree clustering of triphone hidden Markov model (HMM) states features a novel approach that reduces the otherwise excessive number of computations to a practical level by utilizing the statistics of monophone HMM states. The model search algorithm also reduces the search space by utilizing the characteristics of the acoustic model. The experimental results confirmed that VBEC automatically and rapidly yielded an optimum model topology with the highest performance.",
keywords = "Determination of acoustic model topologies, Speech recognition, Variational Bayes, Variational Bayesian estimation and clustering (VBEC)",
author = "Shinji Watanabe and Atsushi Sako and Atsushi Nakamura",
year = "2006",
month = "5",
doi = "10.1109/TSA.2005.857791",
language = "English",
volume = "14",
pages = "855--872",
journal = "IEEE Transactions on Audio, Speech and Language Processing",
issn = "1558-7916",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",
}

TY - JOUR

T1 - Automatic determination of acoustic model topology using variational Bayesian estimation and clustering for large vocabulary continuous speech recognition

AU - Watanabe, Shinji

AU - Sako, Atsushi

AU - Nakamura, Atsushi

PY - 2006/5

Y1 - 2006/5

N2 - We describe the automatic determination of a large and complicated acoustic model for speech recognition using variational Bayesian estimation and clustering (VBEC). We propose an efficient method for decision tree clustering based on a Gaussian mixture model (GMM) and an efficient model search algorithm for finding an appropriate acoustic model topology within the VBEC framework. The GMM-based decision tree clustering of triphone hidden Markov model (HMM) states features a novel approach that reduces the otherwise excessive number of computations to a practical level by utilizing the statistics of monophone HMM states. The model search algorithm also reduces the search space by utilizing the characteristics of the acoustic model. The experimental results confirmed that VBEC automatically and rapidly yielded an optimum model topology with the highest performance.

AB - We describe the automatic determination of a large and complicated acoustic model for speech recognition using variational Bayesian estimation and clustering (VBEC). We propose an efficient method for decision tree clustering based on a Gaussian mixture model (GMM) and an efficient model search algorithm for finding an appropriate acoustic model topology within the VBEC framework. The GMM-based decision tree clustering of triphone hidden Markov model (HMM) states features a novel approach that reduces the otherwise excessive number of computations to a practical level by utilizing the statistics of monophone HMM states. The model search algorithm also reduces the search space by utilizing the characteristics of the acoustic model. The experimental results confirmed that VBEC automatically and rapidly yielded an optimum model topology with the highest performance.

KW - Determination of acoustic model topologies

KW - Speech recognition

KW - Variational Bayes

KW - Variational Bayesian estimation and clustering (VBEC)

UR - http://www.scopus.com/inward/record.url?scp=33646418145&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33646418145&partnerID=8YFLogxK

U2 - 10.1109/TSA.2005.857791

DO - 10.1109/TSA.2005.857791

M3 - Article

AN - SCOPUS:33646418145

VL - 14

SP - 855

EP - 872

JO - IEEE Transactions on Audio, Speech and Language Processing

JF - IEEE Transactions on Audio, Speech and Language Processing

SN - 1558-7916

IS - 3

ER -