Phoneme acquisition model based on vowel imitation using recurrent neural network

Hisashi Kanda, Tetsuya Ogata, Toru Takahashi, Kazunori Komatani, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

A phoneme-acquisition system was developed using a computational model that explains the developmental process of human infants in the early period of language acquisition. Two important findings underlie the modeling of an infant's phoneme acquisition: (1) an infant's vowel-like cooing tends to invoke utterances that its caregiver imitates, and (2) maternal imitation effectively reinforces infant vocalization. We therefore hypothesized that infants acquire phonemes by imitating their caregivers' voices through trial and error, i.e., infants use their experience of self-vocalization to search for imitable and unimitable elements in their caregivers' voices. On the basis of this hypothesis, we constructed a phoneme-acquisition process using vowel-imitation interaction between a human and an infant model. The infant model has a vocal tract system, the Maeda model, and an auditory system based on Mel-Frequency Cepstral Coefficients (MFCCs) obtained through STRAIGHT analysis. We applied a Recurrent Neural Network with Parametric Bias (RNNPB) to learn the experience of self-vocalization, to recognize the human voice, and to produce the sounds imitated by the infant model. To evaluate imitable and unimitable sounds, we used the prediction error of the RNNPB model. The experimental results revealed that, as imitation interactions were repeated, the formants of the sounds imitated by our system moved closer to those of the human voices, and the system self-organized the same vowels across different continuous sounds. This suggests that our system can reflect the process of phoneme acquisition.
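
The abstract outlines the core mechanism: an RNNPB model trained on self-vocalization experience recognizes a caregiver's utterance by searching its parametric-bias (PB) vector, and the residual prediction error indicates how imitable the utterance is. The following Python sketch illustrates that recognition step under simplifying assumptions: a small Elman-style recurrent network with random weights standing in for a trained model, random vectors standing in for MFCC sequences from STRAIGHT analysis, and a finite-difference search over the PB vector. The dimensions, function names, and search procedure are illustrative assumptions, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)

# Assumed (illustrative) dimensions: MFCC frame size, PB vector size, hidden units.
N_MFCC, N_PB, N_HID = 12, 2, 20
# Random weights stand in for a network already trained on self-vocalization.
W_in  = rng.normal(0.0, 0.3, (N_HID, N_MFCC + N_PB))
W_rec = rng.normal(0.0, 0.3, (N_HID, N_HID))
W_out = rng.normal(0.0, 0.3, (N_MFCC, N_HID))

def predict_sequence(mfcc_seq, pb):
    # One-step prediction: given frame t and the PB vector, predict frame t+1.
    h = np.zeros(N_HID)
    preds = []
    for x_t in mfcc_seq[:-1]:
        h = np.tanh(W_in @ np.concatenate([x_t, pb]) + W_rec @ h)
        preds.append(W_out @ h)
    return np.array(preds)

def prediction_error(mfcc_seq, pb):
    # Mean squared one-step prediction error; lower error = more "imitable".
    preds = predict_sequence(mfcc_seq, pb)
    return float(np.mean((preds - mfcc_seq[1:]) ** 2))

def recognize(mfcc_seq, steps=200, lr=0.05, eps=1e-3):
    # Recognition with fixed weights: search the PB vector that minimizes the
    # prediction error, here by simple finite-difference gradient descent.
    pb = np.zeros(N_PB)
    for _ in range(steps):
        base = prediction_error(mfcc_seq, pb)
        grad = np.zeros(N_PB)
        for i in range(N_PB):
            d = np.zeros(N_PB)
            d[i] = eps
            grad[i] = (prediction_error(mfcc_seq, pb + d) - base) / eps
        pb -= lr * grad
    return pb, prediction_error(mfcc_seq, pb)

# Usage: random data stands in for a caregiver utterance's MFCC sequence.
utterance = rng.normal(0.0, 1.0, (30, N_MFCC))
pb, err = recognize(utterance)
print("PB vector:", pb, "residual prediction error:", err)

In the abstract's terms, a low residual error would mark the caregiver's sound as imitable, and the recovered PB vector could then be reused on the generation side to drive the imitated vocalization.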

Original language: English
Title of host publication: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009
Pages: 5388-5393
Number of pages: 6
DOI: https://doi.org/10.1109/IROS.2009.5354825
Publication status: Published - 2009 Dec 11
Externally published: Yes
Event: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009 - St. Louis, MO
Duration: 2009 Oct 11 - 2009 Oct 15

Other

Other: 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009
City: St. Louis, MO
Period: 09/10/11 - 09/10/15

Fingerprint

Recurrent neural networks
Acoustic waves

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Control and Systems Engineering

Cite this

Kanda, H., Ogata, T., Takahashi, T., Komatani, K., & Okuno, H. G. (2009). Phoneme acquisition model based on vowel imitation using recurrent neural network. In 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009 (pp. 5388-5393). [5354825] https://doi.org/10.1109/IROS.2009.5354825

@inproceedings{2e708a70ec064ff3ad9973f064eca221,
title = "Phoneme acquisition model based on vowel imitation using recurrent neural network",
abstract = "A phoneme-acquisition system was developed using a computational model that explains the developmental process of human infants in the early period of language acquisition. Two important findings underlie the modeling of an infant's phoneme acquisition: (1) an infant's vowel-like cooing tends to invoke utterances that its caregiver imitates, and (2) maternal imitation effectively reinforces infant vocalization. We therefore hypothesized that infants acquire phonemes by imitating their caregivers' voices through trial and error, i.e., infants use their experience of self-vocalization to search for imitable and unimitable elements in their caregivers' voices. On the basis of this hypothesis, we constructed a phoneme-acquisition process using vowel-imitation interaction between a human and an infant model. The infant model has a vocal tract system, the Maeda model, and an auditory system based on Mel-Frequency Cepstral Coefficients (MFCCs) obtained through STRAIGHT analysis. We applied a Recurrent Neural Network with Parametric Bias (RNNPB) to learn the experience of self-vocalization, to recognize the human voice, and to produce the sounds imitated by the infant model. To evaluate imitable and unimitable sounds, we used the prediction error of the RNNPB model. The experimental results revealed that, as imitation interactions were repeated, the formants of the sounds imitated by our system moved closer to those of the human voices, and the system self-organized the same vowels across different continuous sounds. This suggests that our system can reflect the process of phoneme acquisition.",
author = "Kanda, Hisashi and Ogata, Tetsuya and Takahashi, Toru and Komatani, Kazunori and Okuno, {Hiroshi G.}",
year = "2009",
month = "12",
day = "11",
doi = "10.1109/IROS.2009.5354825",
language = "English",
isbn = "9781424438044",
pages = "5388--5393",
booktitle = "2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009",

}
