Vowel imitation using vocal tract model and recurrent neural network

Hisashi Kanda, Tetsuya Ogata, Kazunori Komatani, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

A vocal imitation system was developed using a computational model that supports the motor theory of speech perception. A critical problem in vocal imitation is how to generate speech sounds produced by adults, whose vocal tracts have physical properties (i.e., articulatory motions) differing from those of infants' vocal tracts. To solve this problem, a model based on the motor theory of speech perception was constructed. Applying this model enables the vocal imitation system to estimate articulatory motions for unexperienced speech sounds, i.e., sounds that the system has not actually generated itself. The system was implemented using a Recurrent Neural Network with Parametric Bias (RNNPB) and a physical vocal tract model, the Maeda model. Experimental results demonstrated that the system was sufficiently robust to individual differences in speech sounds and could imitate unexperienced vowel sounds.
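The core mechanism named in the abstract, an RNN with Parametric Bias, can be illustrated with a minimal toy sketch. This is not the authors' implementation: it is a small Elman-style recurrent network (random, untrained weights; invented dimensions) whose input is augmented with a fixed "parametric bias" (PB) vector, so that different PB values select different sequence dynamics from the same network, which is the mechanism the paper uses to encode distinct articulatory patterns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions, not from the paper).
n_in, n_pb, n_hidden, n_out = 2, 2, 8, 2

# Random weights stand in for trained parameters.
W_xh = rng.normal(scale=0.5, size=(n_in + n_pb, n_hidden))
W_hh = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
W_hy = rng.normal(scale=0.5, size=(n_hidden, n_out))

def rnnpb_forward(x_seq, pb):
    """Run the RNN over x_seq with a constant parametric-bias vector pb."""
    h = np.zeros(n_hidden)
    outputs = []
    for x in x_seq:
        z = np.concatenate([x, pb])        # PB is appended to every input frame
        h = np.tanh(z @ W_xh + h @ W_hh)   # recurrent state update
        outputs.append(np.tanh(h @ W_hy))  # predicted next frame
    return np.array(outputs)

x_seq = rng.normal(size=(5, n_in))
out_a = rnnpb_forward(x_seq, pb=np.array([1.0, -1.0]))
out_b = rnnpb_forward(x_seq, pb=np.array([-1.0, 1.0]))
# The same input sequence yields different output trajectories under
# different PB values; in the paper, recognizing a sound corresponds to
# searching for the PB vector that best reproduces it.
```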

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 222-232
Number of pages: 11
Volume: 4985 LNCS
Edition: PART 2
DOIs: 10.1007/978-3-540-69162-4_24
Publication status: Published - 2008
Externally published: Yes
Event: 14th International Conference on Neural Information Processing, ICONIP 2007 - Kitakyushu
Duration: 2007 Nov 13 – 2007 Nov 16

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 4985 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th International Conference on Neural Information Processing, ICONIP 2007
City: Kitakyushu
Period: 07/11/13 – 07/11/16

Fingerprint

Imitation
Recurrent neural networks
Speech perception
Acoustic waves
Individual differences
Physical properties
Computational model
Motion
Speech

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Kanda, H., Ogata, T., Komatani, K., & Okuno, H. G. (2008). Vowel imitation using vocal tract model and recurrent neural network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 2 ed., Vol. 4985 LNCS, pp. 222-232). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4985 LNCS, No. PART 2). https://doi.org/10.1007/978-3-540-69162-4_24

@inproceedings{0522f94525324dbda2494fe37da44aab,
title = "Vowel imitation using vocal tract model and recurrent neural network",
abstract = "A vocal imitation system was developed using a computational model that supports the motor theory of speech perception. A critical problem in vocal imitation is how to generate speech sounds produced by adults, whose vocal tracts have physical properties (i.e., articulatory motions) differing from those of infants' vocal tracts. To solve this problem, a model based on the motor theory of speech perception, was constructed. Applying this model enables the vocal imitation system to estimate articulatory motions for unexperienced speech sounds that have not actually been generated by the system. The system was implemented by using Recurrent Neural Network with Parametric Bias (RNNPB) and a physical vocal tract model, called Maeda model. Experimental results demonstrated that the system was sufficiently robust with respect to individual differences in speech sounds and could imitate unexperienced vowel sounds.",
author = "Hisashi Kanda and Tetsuya Ogata and Kazunori Komatani and Okuno, {Hiroshi G.}",
year = "2008",
doi = "10.1007/978-3-540-69162-4_24",
language = "English",
isbn = "3540691596",
volume = "4985 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 2",
pages = "222--232",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
edition = "PART 2",

}

TY - GEN

T1 - Vowel imitation using vocal tract model and recurrent neural network

AU - Kanda, Hisashi

AU - Ogata, Tetsuya

AU - Komatani, Kazunori

AU - Okuno, Hiroshi G.

PY - 2008

Y1 - 2008

AB - A vocal imitation system was developed using a computational model that supports the motor theory of speech perception. A critical problem in vocal imitation is how to generate speech sounds produced by adults, whose vocal tracts have physical properties (i.e., articulatory motions) differing from those of infants' vocal tracts. To solve this problem, a model based on the motor theory of speech perception, was constructed. Applying this model enables the vocal imitation system to estimate articulatory motions for unexperienced speech sounds that have not actually been generated by the system. The system was implemented by using Recurrent Neural Network with Parametric Bias (RNNPB) and a physical vocal tract model, called Maeda model. Experimental results demonstrated that the system was sufficiently robust with respect to individual differences in speech sounds and could imitate unexperienced vowel sounds.

UR - http://www.scopus.com/inward/record.url?scp=54049109678&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=54049109678&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-69162-4_24

DO - 10.1007/978-3-540-69162-4_24

M3 - Conference contribution

AN - SCOPUS:54049109678

SN - 3540691596

SN - 9783540691594

VL - 4985 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 222

EP - 232

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -