Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task

Shinji Watanabe, Atsushi Nakamura

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

In this paper, we propose a novel adaptation technique based on coarse/fine training of transfer vectors. We focus on transfer vector estimation of a Gaussian mean from an initial model to an adapted model. The transfer vector is decomposed into a direction vector and a scaling factor. By using tied-Gaussian class (coarse class) estimation for the direction vector, and by using individual Gaussian class (fine class) estimation for the scaling factor, we can obtain accurate transfer vectors with a small number of parameters. Simple training algorithms for transfer vector estimation are analytically derived using the variational Bayes, maximum a posteriori (MAP) and maximum likelihood methods. Speaker adaptation experiments show that our proposals clearly improve speech recognition performance for any amount of adaptation data, compared with conventional MAP adaptation.
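The decomposition described in the abstract can be illustrated with a minimal sketch. This is not the paper's derivation: the abstract does not give the update formulas, so everything here is an assumption made for illustration, including identity covariances, the count-weighted pooling used for the coarse-class direction, the projection used for the fine-class scaling factor, and the hypothetical names `estimate_transfer_vectors`, `classes`, and `tau` (a MAP-style shrinkage weight).

```python
import math

def estimate_transfer_vectors(initial_means, counts, obs_sums, classes, tau=0.0):
    """Illustrative coarse/fine estimate of adapted Gaussian means.

    initial_means: list of mean vectors of the initial model
    counts:        occupancy count of adaptation data per Gaussian
    obs_sums:      sum of adaptation observations assigned to each Gaussian
    classes:       list of tied-Gaussian (coarse) classes, each a list of indices
    tau:           assumed shrinkage weight; 0.0 gives a plain projection
    """
    dim = len(initial_means[0])

    # Per-Gaussian shift from the initial mean to the adaptation-data mean.
    residuals = []
    for mu, n, s in zip(initial_means, counts, obs_sums):
        if n > 0:
            residuals.append([s_k / n - mu_k for s_k, mu_k in zip(s, mu)])
        else:
            residuals.append([0.0] * dim)

    adapted = [list(mu) for mu in initial_means]
    for members in classes:
        # Coarse step: one unit direction vector shared by the whole class,
        # taken here as the count-weighted sum of the member shifts.
        pooled = [sum(counts[g] * residuals[g][k] for g in members)
                  for k in range(dim)]
        norm = math.sqrt(sum(v * v for v in pooled))
        if norm == 0.0:
            continue  # no usable data for this class: keep the initial means
        direction = [v / norm for v in pooled]

        # Fine step: one scalar per Gaussian, taken here as the projection of
        # its own shift onto the shared direction, shrunk toward zero by tau.
        for g in members:
            if counts[g] + tau == 0:
                continue
            proj = sum(r * d for r, d in zip(residuals[g], direction))
            scale = counts[g] / (counts[g] + tau) * proj
            adapted[g] = [m + scale * d
                          for m, d in zip(initial_means[g], direction)]
    return adapted


if __name__ == "__main__":
    means = [[0.0, 0.0], [1.0, 1.0]]
    counts = [2.0, 1.0]
    sums = [[2.0, 0.0], [3.0, 1.0]]
    print(estimate_transfer_vectors(means, counts, sums, classes=[[0, 1]]))
```

In this sketch each coarse class stores one direction vector and each Gaussian stores only one scalar, which is one way to read the abstract's claim of a small number of parameters.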

Original language: English
Title of host publication: 8th International Conference on Spoken Language Processing, ICSLP 2004
Publisher: International Speech Communication Association
Pages: 2933-2936
Number of pages: 4
Publication status: Published - 2004
Externally published: Yes
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 2004 Oct 4 to 2004 Oct 8

Other

Other: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country: Korea, Republic of
City: Jeju, Jeju Island
Period: 2004 Oct 4 to 2004 Oct 8

Fingerprint

acoustics
scaling
Transfer of Training
experiment
performance

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Watanabe, S., & Nakamura, A. (2004). Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task. In 8th International Conference on Spoken Language Processing, ICSLP 2004 (pp. 2933-2936). International Speech Communication Association.

@inproceedings{9fa7050e260f4a7299f5f701b98cdc6a,
title = "Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task",
abstract = "In this paper, we propose a novel adaptation technique based on coarse/fine training of transfer vectors. We focus on transfer vector estimation of a Gaussian mean from an initial model to an adapted model. The transfer vector is decomposed into a direction vector and a scaling factor. By using tied-Gaussian class (coarse class) estimation for the direction vector, and by using individual Gaussian class (fine class) estimation for the scaling factor, we can obtain accurate transfer vectors with a small number of parameters. Simple training algorithms for transfer vector estimation are analytically derived using the variational Bayes, maximum a posteriori (MAP) and maximum likelihood methods. Speaker adaptation experiments show that our proposals clearly improve speech recognition performance for any amount of adaptation data, compared with conventional MAP adaptation.",
author = "Shinji Watanabe and Atsushi Nakamura",
year = "2004",
language = "English",
pages = "2933--2936",
booktitle = "8th International Conference on Spoken Language Processing, ICSLP 2004",
publisher = "International Speech Communication Association",

}

TY - GEN

T1 - Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task

AU - Watanabe, Shinji

AU - Nakamura, Atsushi

PY - 2004

Y1 - 2004

AB - In this paper, we propose a novel adaptation technique based on coarse/fine training of transfer vectors. We focus on transfer vector estimation of a Gaussian mean from an initial model to an adapted model. The transfer vector is decomposed into a direction vector and a scaling factor. By using tied-Gaussian class (coarse class) estimation for the direction vector, and by using individual Gaussian class (fine class) estimation for the scaling factor, we can obtain accurate transfer vectors with a small number of parameters. Simple training algorithms for transfer vector estimation are analytically derived using the variational Bayes, maximum a posteriori (MAP) and maximum likelihood methods. Speaker adaptation experiments show that our proposals clearly improve speech recognition performance for any amount of adaptation data, compared with conventional MAP adaptation.

UR - http://www.scopus.com/inward/record.url?scp=85009135071&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85009135071&partnerID=8YFLogxK

M3 - Conference contribution

SP - 2933

EP - 2936

BT - 8th International Conference on Spoken Language Processing, ICSLP 2004

PB - International Speech Communication Association

ER -