Application of variational Bayesian approach to speech recognition

Shinji Watanabe, Yasuhiro Minami, Atsushi Nakamura, Naonori Ueda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

17 Citations (Scopus)

Abstract

In this paper, we propose a Bayesian framework, variational Bayesian estimation and clustering for speech recognition (VBEC), which constructs shared-state triphone HMMs based on a variational Bayesian approach and recognizes speech based on Bayesian prediction classification. An appropriate model structure with high recognition performance can be found within the VBEC framework. Unlike conventional methods, such as the BIC or MDL criteria, which are based on the maximum likelihood approach, the proposed model selection is valid in principle even when the amount of data is insufficient, because it does not rely on an asymptotic assumption. In isolated word recognition experiments, we show the advantage of VBEC over conventional methods, especially when dealing with small amounts of data.
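
As a rough illustration of the asymptotic assumption mentioned in the abstract, the two kinds of model-selection score can be written side by side. The notation is assumed here for illustration and is not taken from the paper: O is the training data, Z the latent variables (for example HMM state sequences), \theta the model parameters, m a candidate model structure, k_m its number of free parameters, N the number of samples, and q an approximating posterior.

```latex
% Sketch only; all symbols are illustrative assumptions, not the paper's notation.
\begin{align}
\log p(O \mid m) &\approx \log p(O \mid \hat{\theta}_m, m) - \frac{k_m}{2}\log N
  && \text{(BIC-type score, justified as } N \to \infty\text{)} \\
\log p(O \mid m) &\ge \mathcal{F}_m[q]
  = \mathbb{E}_{q(Z,\theta)}\!\left[\log \frac{p(O, Z, \theta \mid m)}{q(Z,\theta)}\right]
  && \text{(variational lower bound, holds for any } N\text{)}
\end{align}
```

The first line replaces the integral over \theta with a maximum likelihood estimate plus a penalty that is accurate only for large N. The second is a bound on the same log evidence that involves no large-sample approximation, which is the general sense in which a variational Bayesian criterion can remain meaningful with little data. How the paper evaluates such a bound for shared-state triphone HMMs is not shown on this page.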

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 15 - Proceedings of the 2002 Conference, NIPS 2002
Publisher: Neural information processing systems foundation
ISBN (Print): 0262025507, 9780262025508
Publication status: Published - 2003
Externally published: Yes
Event: 16th Annual Neural Information Processing Systems Conference, NIPS 2002 - Vancouver, BC, Canada
Duration: 2002 Dec 9 - 2002 Dec 14

Other

Other: 16th Annual Neural Information Processing Systems Conference, NIPS 2002
Country: Canada
City: Vancouver, BC
Period: 02/12/9 - 02/12/14

Fingerprint

Model structures
Speech recognition
Maximum likelihood
Experiments

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Watanabe, S., Minami, Y., Nakamura, A., & Ueda, N. (2003). Application of variational Bayesian approach to speech recognition. In Advances in Neural Information Processing Systems 15 - Proceedings of the 2002 Conference, NIPS 2002 Neural information processing systems foundation.

@inproceedings{68995be87b83481787fe0adceb02bd15,
title = "Application of variational Bayesian approach to speech recognition",
abstract = "In this paper, we propose a Bayesian framework, which constructs shared-state triphone HMMs based on a variational Bayesian approach, and recognizes speech based on the Bayesian prediction classification; variational Bayesian estimation and clustering for speech recognition (VBEC). An appropriate model structure with high recognition performance can be found within a VBEC framework. Unlike conventional methods, including BIC or MDL criterion based on the maximum likelihood approach, the proposed model selection is valid in principle, even when there are insufficient amounts of data, because it does not use an asymptotic assumption. In isolated word recognition experiments, we show the advantage of VBEC over conventional methods, especially when dealing with small amounts of data.",
author = "Shinji Watanabe and Yasuhiro Minami and Atsushi Nakamura and Naonori Ueda",
year = "2003",
language = "English",
isbn = "0262025507",
booktitle = "Advances in Neural Information Processing Systems 15 - Proceedings of the 2002 Conference, NIPS 2002",
publisher = "Neural information processing systems foundation",
}

TY  - GEN
T1  - Application of variational Bayesian approach to speech recognition
AU  - Watanabe, Shinji
AU  - Minami, Yasuhiro
AU  - Nakamura, Atsushi
AU  - Ueda, Naonori
PY  - 2003
Y1  - 2003
N2  - In this paper, we propose a Bayesian framework, which constructs shared-state triphone HMMs based on a variational Bayesian approach, and recognizes speech based on the Bayesian prediction classification; variational Bayesian estimation and clustering for speech recognition (VBEC). An appropriate model structure with high recognition performance can be found within a VBEC framework. Unlike conventional methods, including BIC or MDL criterion based on the maximum likelihood approach, the proposed model selection is valid in principle, even when there are insufficient amounts of data, because it does not use an asymptotic assumption. In isolated word recognition experiments, we show the advantage of VBEC over conventional methods, especially when dealing with small amounts of data.
AB  - In this paper, we propose a Bayesian framework, which constructs shared-state triphone HMMs based on a variational Bayesian approach, and recognizes speech based on the Bayesian prediction classification; variational Bayesian estimation and clustering for speech recognition (VBEC). An appropriate model structure with high recognition performance can be found within a VBEC framework. Unlike conventional methods, including BIC or MDL criterion based on the maximum likelihood approach, the proposed model selection is valid in principle, even when there are insufficient amounts of data, because it does not use an asymptotic assumption. In isolated word recognition experiments, we show the advantage of VBEC over conventional methods, especially when dealing with small amounts of data.
UR  - http://www.scopus.com/inward/record.url?scp=79957689964&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=79957689964&partnerID=8YFLogxK
M3  - Conference contribution
SN  - 0262025507
SN  - 9780262025508
BT  - Advances in Neural Information Processing Systems 15 - Proceedings of the 2002 Conference, NIPS 2002
PB  - Neural information processing systems foundation
ER  - 