The multi timescale phoneme acquisition model of the self-organizing based on the dynamic features

Kouki Miyazawa, Hideaki Miura, Hideaki Kikuchi, Reiko Mazuka

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    How infants learn the acoustic expression of each phoneme of their native language remains unclear. Recent studies have examined phoneme acquisition using computational models. However, these studies used a limited vocabulary as input and did not handle continuous speech comparable to that of a natural environment. We therefore use natural continuous speech to build a self-organizing model that simulates human cognitive ability, and we analyze the quality and quantity of speech information necessary for acquiring the native phoneme system. Our model is designed to learn the values of the acoustic features of continuous speech and to estimate the number and boundaries of the phoneme categories without explicit instruction. In a previous study, our model acquired the detailed vowels of the input language. In this study, we examine the mechanism necessary for an infant to acquire all the phonemes of a language, including consonants. In natural speech, vowels have stationary features; hence, our previous model is well suited to learning them. However, learning consonants with that model is difficult because most consonants have more dynamic features than vowels. To solve this problem, we designed a method that separates "stable" and "dynamic" speech patterns using feature extraction based on the auditory representations used by humans. Using this method, we show that an unstable phoneme can be acquired without instruction.
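The idea of separating "stable" from "dynamic" speech patterns can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the feature extraction, so this example simply assumes that each frame is a small feature vector and that a frame counts as "dynamic" when it differs from its predecessor by more than a threshold. The function names, the toy frames, and the threshold value are all hypothetical.

```python
import math
from typing import List, Sequence


def frame_deltas(frames: Sequence[Sequence[float]]) -> List[float]:
    """Euclidean distance between each frame and the one before it.

    The first frame has no predecessor, so its delta is set to 0.0.
    """
    if not frames:
        return []
    deltas = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        deltas.append(math.dist(prev, cur))
    return deltas


def split_stable_dynamic(
    frames: Sequence[Sequence[float]], threshold: float
) -> List[str]:
    """Label each frame "dynamic" if its delta exceeds the threshold, else "stable".

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        "dynamic" if delta > threshold else "stable"
        for delta in frame_deltas(frames)
    ]


# Toy input (made up): a slowly varying stretch, a rapid transition,
# then another slowly varying stretch.
toy_frames = [
    [1.0, 1.0], [1.0, 1.1], [1.1, 1.0],
    [2.0, 3.0], [4.0, 5.0],
    [4.0, 5.1], [4.1, 5.0],
]

labels = split_stable_dynamic(toy_frames, threshold=0.5)
print(labels)
```

In a setup like this, the "stable" frames could be passed to a learner suited to stationary sounds and the "dynamic" frames to one that models change over time; how the actual model handles the two streams is described in the paper itself, not here.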

    Original language: English
    Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
    Pages: 749-752
    Number of pages: 4
    Publication status: Published - 2011
    Event: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
    Duration: 2011 Aug 27 – 2011 Aug 31

    Keywords

    • Consonants
    • Dynamic features
    • Language acquisition
    • Neural network

    ASJC Scopus subject areas

    • Language and Linguistics
    • Human-Computer Interaction
    • Signal Processing
    • Software
    • Modelling and Simulation

    Cite this

    Miyazawa, K., Miura, H., Kikuchi, H., & Mazuka, R. (2011). The multi timescale phoneme acquisition model of the self-organizing based on the dynamic features. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 749-752).

    @inproceedings{80df6179682b46aca179de1c9ae39154,
    title = "The multi timescale phoneme acquisition model of the self-organizing based on the dynamic features",
    keywords = "Consonants, Dynamic features, Language acquisition, Neural network",
    author = "Miyazawa, Kouki and Miura, Hideaki and Kikuchi, Hideaki and Mazuka, Reiko",
    year = "2011",
    language = "English",
    pages = "749--752",
    booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

    }

    TY - GEN

    T1 - The multi timescale phoneme acquisition model of the self-organizing based on the dynamic features

    AU - Miyazawa, Kouki

    AU - Miura, Hideaki

    AU - Kikuchi, Hideaki

    AU - Mazuka, Reiko

    PY - 2011

    Y1 - 2011

    KW - Consonants

    KW - Dynamic features

    KW - Language acquisition

    KW - Neural network

    UR - http://www.scopus.com/inward/record.url?scp=84865765254&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84865765254&partnerID=8YFLogxK

    M3 - Conference contribution

    AN - SCOPUS:84865765254

    SP - 749

    EP - 752

    BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

    ER -