Predictor—Corrector Adaptation by Using Time Evolution System With Macroscopic Time Scale

Shinji Watanabe, Atsushi Nakamura

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Incremental adaptation techniques for speech recognition are aimed at adjusting acoustic models to time-variant acoustic characteristics related to such factors as changes of speaker, speaking style, and noise source over time. In this paper, we propose a novel incremental adaptation framework, which models such time-variant characteristics by successively updating posterior distributions of acoustic model parameters based on a macroscopic time scale (e.g., every set of more than a dozen utterances). The proposed incremental update involves a predictor-corrector algorithm based on a macroscopic time evolution system in accordance with the Kalman filter theory. We also provide a unified interpretation of the proposal and the two major conventional approaches of indirect adaptation via transformation parameters [e.g., maximum-likelihood linear regression (MLLR)] and direct adaptation of classifier parameters [e.g., maximum a posteriori (MAP)]. We reveal analytically and experimentally that the proposed incremental adaptation realizes the predictor-corrector algorithm and involves both the conventional and their combinatorial adaptation approaches. Consequently, the proposal achieves robust recognition performance based on a balanced incremental adaptation between quickness and stability.
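
As a rough illustration of the predictor-corrector idea mentioned in the abstract, the following is a minimal sketch of a generic scalar Kalman filter. It is not the authors' algorithm. The random-walk evolution model, the single scalar parameter, and the names `Gaussian`, `predict`, `correct`, `adapt`, `process_var`, and `obs_var` are all assumptions made for this example. Each observation stands in for a statistic collected over one macroscopic block (for example, a set of utterances).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    """Gaussian belief over one scalar model parameter (hypothetical)."""
    mean: float
    var: float


def predict(post: Gaussian, process_var: float) -> Gaussian:
    """Predictor step under an assumed random-walk time evolution.

    The mean is carried over unchanged; the variance grows by process_var,
    reflecting that the parameter may have drifted since the last block.
    """
    return Gaussian(post.mean, post.var + process_var)


def correct(prior: Gaussian, obs: float, obs_var: float) -> Gaussian:
    """Corrector step: standard scalar Kalman update with one observation."""
    gain = prior.var / (prior.var + obs_var)  # Kalman gain, between 0 and 1
    mean = prior.mean + gain * (obs - prior.mean)
    var = (1.0 - gain) * prior.var
    return Gaussian(mean, var)


def adapt(initial: Gaussian, block_obs: List[float],
          process_var: float, obs_var: float) -> List[Gaussian]:
    """Run predict-then-correct once per block and return each posterior."""
    post = initial
    history = []
    for obs in block_obs:
        prior = predict(post, process_var)
        post = correct(prior, obs, obs_var)
        history.append(post)
    return history


if __name__ == "__main__":
    # Hypothetical data: the underlying value jumps from 0 to 2 at block 4.
    blocks = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0]
    for q in (0.0, 1.0):
        means = [round(g.mean, 3) for g in adapt(Gaussian(0.0, 1.0), blocks, q, 1.0)]
        print(f"process_var={q}: {means}")
```

In this sketch, a larger `process_var` makes the filter follow changes faster, while a smaller one makes the estimate more stable. This is one simple way to picture the balance between quickness and stability that the abstract describes.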

Original language: English
Pages (from-to): 395-406
Number of pages: 12
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 18
Issue number: 2
DOI: 10.1109/TASL.2009.2029717
Publication status: Published - 2010
Externally published: Yes

Fingerprint

Acoustics
Speech recognition
Linear regression
Kalman filters
Maximum likelihood
Classifiers
acoustics
proposals
speech recognition
classifiers
predictions
regression analysis
adjusting

Keywords

  • Acoustic model
  • incremental adaptation
  • macroscopic time evolution
  • predictor-corrector algorithm
  • speech recognition

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering

Cite this

Predictor—Corrector Adaptation by Using Time Evolution System With Macroscopic Time Scale. / Watanabe, Shinji; Nakamura, Atsushi.

In: IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, No. 2, 2010, p. 395-406.


@article{e2f96082a48c47fca1358cb3239c016b,
title = "Predictor--Corrector Adaptation by Using Time Evolution System With Macroscopic Time Scale",
abstract = "Incremental adaptation techniques for speech recognition are aimed at adjusting acoustic models to time-variant acoustic characteristics related to such factors as changes of speaker, speaking style, and noise source over time. In this paper, we propose a novel incremental adaptation framework, which models such time-variant characteristics by successively updating posterior distributions of acoustic model parameters based on a macroscopic time scale (e.g., every set of more than a dozen utterances). The proposed incremental update involves a predictor-corrector algorithm based on a macroscopic time evolution system in accordance with the Kalman filter theory. We also provide a unified interpretation of the proposal and the two major conventional approaches of indirect adaptation via transformation parameters [e.g., maximum-likelihood linear regression (MLLR)] and direct adaptation of classifier parameters [e.g., maximum a posteriori (MAP)]. We reveal analytically and experimentally that the proposed incremental adaptation realizes the predictor-corrector algorithm and involves both the conventional and their combinatorial adaptation approaches. Consequently, the proposal achieves robust recognition performance based on a balanced incremental adaptation between quickness and stability.",
keywords = "Acoustic model, incremental adaptation, macroscopic time evolution, predictor-corrector algorithm, speech recognition",
author = "Shinji Watanabe and Atsushi Nakamura",
year = "2010",
doi = "10.1109/TASL.2009.2029717",
language = "English",
volume = "18",
pages = "395--406",
journal = "IEEE Transactions on Audio, Speech and Language Processing",
issn = "1558-7916",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "2",

}

TY - JOUR

T1 - Predictor—Corrector Adaptation by Using Time Evolution System With Macroscopic Time Scale

AU - Watanabe, Shinji

AU - Nakamura, Atsushi

PY - 2010

Y1 - 2010

AB - Incremental adaptation techniques for speech recognition are aimed at adjusting acoustic models to time-variant acoustic characteristics related to such factors as changes of speaker, speaking style, and noise source over time. In this paper, we propose a novel incremental adaptation framework, which models such time-variant characteristics by successively updating posterior distributions of acoustic model parameters based on a macroscopic time scale (e.g., every set of more than a dozen utterances). The proposed incremental update involves a predictor-corrector algorithm based on a macroscopic time evolution system in accordance with the Kalman filter theory. We also provide a unified interpretation of the proposal and the two major conventional approaches of indirect adaptation via transformation parameters [e.g., maximum-likelihood linear regression (MLLR)] and direct adaptation of classifier parameters [e.g., maximum a posteriori (MAP)]. We reveal analytically and experimentally that the proposed incremental adaptation realizes the predictor-corrector algorithm and involves both the conventional and their combinatorial adaptation approaches. Consequently, the proposal achieves robust recognition performance based on a balanced incremental adaptation between quickness and stability.

KW - Acoustic model

KW - incremental adaptation

KW - macroscopic time evolution

KW - predictor-corrector algorithm

KW - speech recognition

UR - http://www.scopus.com/inward/record.url?scp=85008538758&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85008538758&partnerID=8YFLogxK

U2 - 10.1109/TASL.2009.2029717

DO - 10.1109/TASL.2009.2029717

M3 - Article

VL - 18

SP - 395

EP - 406

JO - IEEE Transactions on Audio, Speech and Language Processing

JF - IEEE Transactions on Audio, Speech and Language Processing

SN - 1558-7916

IS - 2

ER -