Acoustic-to-articulatory inverse mapping using an HMM-based speech production model

Sadao Hiroya, Masaaki Honda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

We present a method that estimates articulatory movements from speech acoustics using a speech production model based on hidden Markov models (HMMs). The model statistically generates speech acoustics and articulatory movements from a given phonemic string. It consists of HMMs of articulatory movements for each phoneme and an articulatory-to-acoustic mapping for each HMM state. Given the speech acoustics, the articulatory parameters are determined as the maximum a posteriori (MAP) estimate under this statistical model. The method was evaluated on sentences by comparing the estimated articulatory parameters with observed ones. The average RMS error of the estimated articulatory parameters was 1.79 mm when the phonemic information of the utterance was given and 2.16 mm when it was not.
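The MAP estimation step and the RMS error metric mentioned in the abstract can be illustrated with a minimal sketch for a single HMM state. It is not the authors' implementation: the abstract does not specify the form of the articulatory-to-acoustic mapping, so the sketch assumes a Gaussian articulatory prior and a linear mapping with Gaussian noise. The function names and the symbols mu, sigma, A, b and R are hypothetical, and the toy numbers are not data from the paper.

```python
import numpy as np


def map_articulatory_estimate(y, mu, sigma, A, b, R):
    """MAP estimate of an articulatory vector x given an acoustic vector y.

    Assumed (hypothetical) model for one HMM state:
        x ~ N(mu, sigma)                  articulatory prior of the state
        y = A @ x + b + w,  w ~ N(0, R)   linear articulatory-to-acoustic mapping
    Under these assumptions the posterior of x is Gaussian, so the MAP
    estimate equals the posterior mean computed below.
    """
    y, mu, b = (np.asarray(v, dtype=float) for v in (y, mu, b))
    sigma, A, R = (np.asarray(m, dtype=float) for m in (sigma, A, R))
    residual = y - (A @ mu + b)   # acoustics not explained by the prior mean
    S = A @ sigma @ A.T + R       # covariance of y predicted by the model
    return mu + sigma @ A.T @ np.linalg.solve(S, residual)


def rms_error(estimated, observed):
    """Root-mean-square error between estimated and observed parameters."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy usage with made-up values (two articulatory and two acoustic dimensions).
mu = np.zeros(2)
sigma = np.eye(2)
A = np.eye(2)
b = np.zeros(2)
R = np.eye(2)
x_hat = map_articulatory_estimate([2.0, 4.0], mu, sigma, A, b, R)
print(x_hat, rms_error(x_hat, [1.0, 4.0]))
```

With equal prior and noise covariances, as in the toy values, the estimate lies halfway between the prior mean and the observation. A full system would also have to choose or weight the HMM state sequence, which this single-state sketch leaves out.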

Original language: English
Title of host publication: 7th International Conference on Spoken Language Processing, ICSLP 2002
Publisher: International Speech Communication Association
Pages: 2305-2308
Number of pages: 4
Publication status: Published - 2002
Externally published: Yes
Event: 7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States
Duration: 2002 Sep 16 to 2002 Sep 20

Fingerprint

Acoustics
Speech Production
Hidden Markov Model
Speech Acoustics
Phonemics
Performance

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Hiroya, S., & Honda, M. (2002). Acoustic-to-articulatory inverse mapping using an HMM-based speech production model. In 7th International Conference on Spoken Language Processing, ICSLP 2002 (pp. 2305-2308). International Speech Communication Association.

@inproceedings{e53b49f174c9473f822bb25f6f4cc875,
title = "Acoustic-to-articulatory inverse mapping using an HMM-based speech production model",
author = "Sadao Hiroya and Masaaki Honda",
year = "2002",
language = "English",
pages = "2305--2308",
booktitle = "7th International Conference on Spoken Language Processing, ICSLP 2002",
publisher = "International Speech Communication Association",
}

TY - GEN

T1 - Acoustic-to-articulatory inverse mapping using an HMM-based speech production model

AU - Hiroya, Sadao

AU - Honda, Masaaki

PY - 2002

Y1 - 2002

UR - http://www.scopus.com/inward/record.url?scp=85009243663&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85009243663&partnerID=8YFLogxK

M3 - Conference contribution

SP - 2305

EP - 2308

BT - 7th International Conference on Spoken Language Processing, ICSLP 2002

PB - International Speech Communication Association

ER -