Segmenting acoustic signal with articulatory movement using recurrent neural network for phoneme acquisition

Hisashi Kanda, Tetsuya Ogata, Kazunori Komatani, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

This paper proposes a computational model of phoneme acquisition by infants. Human infants perceive speech sounds not as discrete phoneme sequences but as continuous acoustic signals. One of the critical problems in phoneme acquisition is how to segment these continuous speech sounds. The key idea for solving this problem is that articulatory mechanisms such as the vocal tract help human beings perceive speech sound units corresponding to phonemes. That is, the ability to distinguish phonemes is learned by recognizing unstable points in the dynamics of continuous sound coupled with articulatory movement. We have developed a vocal imitation system that embodies the relationship between articulatory movements and the sounds they produce. To segment acoustic signals together with articulatory movements, we apply to our system a segmentation method based on a Recurrent Neural Network with Parametric Bias (RNNPB). This method determines multiple segmentation boundaries in a temporal sequence using the prediction error of the RNNPB model, and the PB values obtained by the method can be encoded as a kind of phoneme. Our system was implemented using a physical vocal tract model called the Maeda model. Experimental results demonstrated that our system can self-organize the same phonemes in different continuous sounds. This suggests that our model reflects the process of phoneme acquisition.
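The idea of placing segmentation boundaries where a predictor's error becomes large can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `segment_by_prediction_error` and `split_at`, the `threshold` and `min_gap` parameters, and the peak-picking rule itself are hypothetical, and the error sequence is assumed to have been computed beforehand by some predictive model (the paper uses an RNNPB, which is not implemented here).

```python
from typing import List, Sequence, Tuple


def segment_by_prediction_error(
    errors: Sequence[float], threshold: float, min_gap: int = 1
) -> List[int]:
    """Return time indices treated as segment boundaries.

    Assumed rule (not the paper's): a boundary is a local peak of the
    per-step prediction error that exceeds `threshold`, and boundaries
    must be at least `min_gap` steps apart.
    """
    boundaries: List[int] = []
    for t in range(1, len(errors) - 1):
        # Local peak: not lower than the previous step, higher than the next.
        is_peak = errors[t] >= errors[t - 1] and errors[t] > errors[t + 1]
        if is_peak and errors[t] > threshold:
            if not boundaries or t - boundaries[-1] >= min_gap:
                boundaries.append(t)
    return boundaries


def split_at(boundaries: Sequence[int], length: int) -> List[Tuple[int, int]]:
    """Turn boundary indices into half-open (start, end) segments."""
    edges = [0] + list(boundaries) + [length]
    return [(a, b) for a, b in zip(edges[:-1], edges[1:]) if b > a]


# Toy error sequence with two clear error peaks (hypothetical values).
toy_errors = [0.1, 0.1, 0.9, 0.2, 0.1, 0.1, 0.8, 0.1]
toy_boundaries = segment_by_prediction_error(toy_errors, threshold=0.5)
toy_segments = split_at(toy_boundaries, len(toy_errors))
```

In a setup like the one the abstract describes, each resulting segment would then be associated with its own PB values; how those values are learned is outside the scope of this sketch.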

Original language: English
Title of host publication: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS
Pages: 1712-1717
Number of pages: 6
DOIs: https://doi.org/10.1109/IROS.2008.4651060
Publication status: Published - 2008 Dec 1
Externally published: Yes
Event: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS - Nice, France
Duration: 2008 Sep 22 to 2008 Sep 26

Publication series

Name: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS

Conference

Conference: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS
Country: France
City: Nice
Period: 08/9/22 to 08/9/26

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Kanda, H., Ogata, T., Komatani, K., & Okuno, H. G. (2008). Segmenting acoustic signal with articulatory movement using recurrent neural network for phoneme acquisition. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (pp. 1712-1717). [4651060] (2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS). https://doi.org/10.1109/IROS.2008.4651060