Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task

Shinji Watanabe, Atsushi Nakamura

Research output: Chapter in Book/Report/Conference proceedingConference contribution

10 Citations (Scopus)

Abstract

In this paper, we propose a novel adaptation technique based on coarse/fine training of transfer vectors. We focus on transfer vector estimation of a Gaussian mean from an initial model to an adapted model. The transfer vector is decomposed into a direction vector and a scaling factor. By using tied-Gaussian class (coarse class) estimation for the direction vector, and by using individual Gaussian class (fine class) estimation for the scaling factor, we can obtain accurate transfer vectors with a small number of parameters. Simple training algorithms for transfer vector estimation are analytically derived using the variational Bayes, maximum a posteriori (MAP) and maximum likelihood methods. Speaker adaptation experiments show that our proposals clearly improve speech recognition performance for any amount of adaptation data, compared with conventional MAP adaptation.

Original languageEnglish
Title of host publication8th International Conference on Spoken Language Processing, ICSLP 2004
PublisherInternational Speech Communication Association
Pages2933-2936
Number of pages4
Publication statusPublished - 2004
Externally publishedYes
Event8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 2004 Oct 42004 Oct 8

Other

Other8th International Conference on Spoken Language Processing, ICSLP 2004
CountryKorea, Republic of
CityJeju, Jeju Island
Period04/10/404/10/8

    Fingerprint

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Watanabe, S., & Nakamura, A. (2004). Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task. In 8th International Conference on Spoken Language Processing, ICSLP 2004 (pp. 2933-2936). International Speech Communication Association.