Model adaptation for automatic speech recognition based on multiple time scale evolution

Shinji Watanabe, Atsushi Nakamura, Biing Hwang Juang

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

In real-world conversation, speech characteristics change because of various factors and at various temporal rates. These temporal changes have their own dynamics; we therefore propose to extend single-time-scale incremental adaptation to multiscale adaptation, which has the potential to greatly increase the model's robustness because it includes adaptation mechanisms that approximate the nature of the characteristic change. The formulation of incremental adaptation assumes a time evolution system of the model, in which the posterior distributions used in the decision process are successively updated on a macroscopic time scale in accordance with Kalman filter theory. In this paper, we extend the original incremental adaptation scheme, which is based on a single time scale, to multiple time scales, and apply the method to the adaptation of both the acoustic model and the language model. We further investigate methods of integrating the multiscale adaptation scheme to achieve robust speech recognition performance. Large-vocabulary continuous speech recognition experiments on English and Japanese lectures revealed the importance of modeling multiscale properties in speech recognition.

Original language: English
Pages (from-to): 1081-1084
Number of pages: 4
Journal: Unknown Journal
Publication status: Published - 2011
Externally published: Yes

Fingerprint

Multiple Time Scales
Automatic Speech Recognition
Speech recognition
Continuous speech recognition
Kalman filters
Model
Time Scales
Model Robustness
Robust Speech Recognition
conversation
Acoustics
Acoustic Model
Multiscale Modeling
Evolution System
Language Model
lectures
Posterior distribution

Keywords

  • Incremental adaptation
  • Multiscale
  • Speech recognition
  • Time evolution system

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

Model adaptation for automatic speech recognition based on multiple time scale evolution. / Watanabe, Shinji; Nakamura, Atsushi; Juang, Biing Hwang.

In: Unknown Journal, 2011, p. 1081-1084.

@article{949932234a23468592a5d61a2a54f9e4,
title = "Model adaptation for automatic speech recognition based on multiple time scale evolution",
abstract = "In real-world conversation, speech characteristics change because of various factors and at various temporal rates. These temporal changes have their own dynamics; we therefore propose to extend single-time-scale incremental adaptation to multiscale adaptation, which has the potential to greatly increase the model's robustness because it includes adaptation mechanisms that approximate the nature of the characteristic change. The formulation of incremental adaptation assumes a time evolution system of the model, in which the posterior distributions used in the decision process are successively updated on a macroscopic time scale in accordance with Kalman filter theory. In this paper, we extend the original incremental adaptation scheme, which is based on a single time scale, to multiple time scales, and apply the method to the adaptation of both the acoustic model and the language model. We further investigate methods of integrating the multiscale adaptation scheme to achieve robust speech recognition performance. Large-vocabulary continuous speech recognition experiments on English and Japanese lectures revealed the importance of modeling multiscale properties in speech recognition.",
keywords = "Incremental adaptation, Multiscale, Speech recognition, Time evolution system",
author = "Watanabe, Shinji and Nakamura, Atsushi and Juang, {Biing Hwang}",
year = "2011",
language = "English",
pages = "1081--1084",
journal = "Nuclear Physics A",
issn = "0375-9474",
publisher = "Elsevier",

}

TY - JOUR

T1 - Model adaptation for automatic speech recognition based on multiple time scale evolution

AU - Watanabe, Shinji

AU - Nakamura, Atsushi

AU - Juang, Biing Hwang

PY - 2011

Y1 - 2011

N2 - In real-world conversation, speech characteristics change because of various factors and at various temporal rates. These temporal changes have their own dynamics; we therefore propose to extend single-time-scale incremental adaptation to multiscale adaptation, which has the potential to greatly increase the model's robustness because it includes adaptation mechanisms that approximate the nature of the characteristic change. The formulation of incremental adaptation assumes a time evolution system of the model, in which the posterior distributions used in the decision process are successively updated on a macroscopic time scale in accordance with Kalman filter theory. In this paper, we extend the original incremental adaptation scheme, which is based on a single time scale, to multiple time scales, and apply the method to the adaptation of both the acoustic model and the language model. We further investigate methods of integrating the multiscale adaptation scheme to achieve robust speech recognition performance. Large-vocabulary continuous speech recognition experiments on English and Japanese lectures revealed the importance of modeling multiscale properties in speech recognition.

AB - In real-world conversation, speech characteristics change because of various factors and at various temporal rates. These temporal changes have their own dynamics; we therefore propose to extend single-time-scale incremental adaptation to multiscale adaptation, which has the potential to greatly increase the model's robustness because it includes adaptation mechanisms that approximate the nature of the characteristic change. The formulation of incremental adaptation assumes a time evolution system of the model, in which the posterior distributions used in the decision process are successively updated on a macroscopic time scale in accordance with Kalman filter theory. In this paper, we extend the original incremental adaptation scheme, which is based on a single time scale, to multiple time scales, and apply the method to the adaptation of both the acoustic model and the language model. We further investigate methods of integrating the multiscale adaptation scheme to achieve robust speech recognition performance. Large-vocabulary continuous speech recognition experiments on English and Japanese lectures revealed the importance of modeling multiscale properties in speech recognition.

KW - Incremental adaptation

KW - Multiscale

KW - Speech recognition

KW - Time evolution system

UR - http://www.scopus.com/inward/record.url?scp=84865732431&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84865732431&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84865732431

SP - 1081

EP - 1084

JO - Nuclear Physics A

JF - Nuclear Physics A

SN - 0375-9474

ER -