Data-driven speech animation synthesis focusing on realistic inside of the mouth

Masahide Kawai, Tomoyori Iwao, Daisuke Mima, Akinobu Maejima, Shigeo Morishima

    Research output: Contribution to journal › Article

    8 Citations (Scopus)

    Abstract

    Speech animation synthesis remains a challenging topic in computer graphics. Despite many attempts, existing methods have not reproduced the detailed appearance of the inner mouth in the resulting animation, such as the tip of the tongue held between the teeth or the back of the tongue. To solve this problem, we propose a data-driven speech animation synthesis method that focuses on the inside of the mouth. First, we classify inner-mouth images into teeth, labeled by the opening distance of the teeth, and tongue, labeled by phoneme information. We then insert them into an existing speech animation according to the opening distance of the teeth and the phoneme information. Finally, we apply a patch-based texture synthesis technique, using a database of 2,213 images collected from 7 subjects, to the resulting animation. With the proposed method, we can automatically generate a speech animation with a realistic inner mouth from an existing speech animation created by previous methods.
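
The selection step described in the abstract (teeth chosen by opening distance, tongue chosen by phoneme) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the class names, the `plan_inner_mouth` function, the millimetre unit, and the nearest-value matching rule are all assumptions introduced here, and the patch-based texture synthesis stage is omitted entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TeethImage:
    image_id: str
    opening_mm: float  # assumed unit for the teeth opening distance label


@dataclass(frozen=True)
class TongueImage:
    image_id: str
    phoneme: str  # phoneme label attached to the tongue image


@dataclass(frozen=True)
class Frame:
    opening_mm: float  # teeth opening measured in the existing animation
    phoneme: str       # phoneme spoken at this frame


def select_teeth(teeth_db: List[TeethImage], opening_mm: float) -> TeethImage:
    # Hypothetical rule: pick the teeth image whose label is closest
    # to the frame's opening distance.
    return min(teeth_db, key=lambda t: abs(t.opening_mm - opening_mm))


def select_tongue(tongue_db: List[TongueImage], phoneme: str) -> Optional[TongueImage]:
    # Hypothetical rule: exact phoneme match, or no tongue image at all.
    for tongue in tongue_db:
        if tongue.phoneme == phoneme:
            return tongue
    return None


def plan_inner_mouth(
    frames: List[Frame],
    teeth_db: List[TeethImage],
    tongue_db: List[TongueImage],
) -> List[Tuple[str, Optional[str]]]:
    # For every frame, return the (teeth image id, tongue image id or None)
    # pair that would be inserted into the existing animation.
    plan = []
    for frame in frames:
        teeth = select_teeth(teeth_db, frame.opening_mm)
        tongue = select_tongue(tongue_db, frame.phoneme)
        plan.append((teeth.image_id, tongue.image_id if tongue else None))
    return plan
```

Calling `plan_inner_mouth` with a list of frames and two small labeled databases yields one image pair per frame; a real system would then blend those images into the mouth region and refine the result with texture synthesis.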

    Original language: English
    Pages (from-to): 401-409
    Number of pages: 9
    Journal: Journal of Information Processing
    Volume: 22
    Issue number: 2
    DOI: 10.2197/ipsjjip.22.401
    Publication status: Published - 2014

    Fingerprint

    Animation
    Computer graphics
    Labeling
    Textures

    Keywords

    • Detailization
    • Inner mouth
    • Phoneme combination
    • Skull bone
    • Speech animation

    ASJC Scopus subject areas

    • Computer Science (all)

    Cite this

    Data-driven speech animation synthesis focusing on realistic inside of the mouth. / Kawai, Masahide; Iwao, Tomoyori; Mima, Daisuke; Maejima, Akinobu; Morishima, Shigeo.

    In: Journal of Information Processing, Vol. 22, No. 2, 2014, p. 401-409.

    Kawai, Masahide ; Iwao, Tomoyori ; Mima, Daisuke ; Maejima, Akinobu ; Morishima, Shigeo. / Data-driven speech animation synthesis focusing on realistic inside of the mouth. In: Journal of Information Processing. 2014 ; Vol. 22, No. 2. pp. 401-409.
    @article{5aff0102852845fba8d38918412a3911,
    title = "Data-driven speech animation synthesis focusing on realistic inside of the mouth",
    abstract = "Speech animation synthesis remains a challenging topic in computer graphics. Despite many attempts, existing methods have not reproduced the detailed appearance of the inner mouth in the resulting animation, such as the tip of the tongue held between the teeth or the back of the tongue. To solve this problem, we propose a data-driven speech animation synthesis method that focuses on the inside of the mouth. First, we classify inner-mouth images into teeth, labeled by the opening distance of the teeth, and tongue, labeled by phoneme information. We then insert them into an existing speech animation according to the opening distance of the teeth and the phoneme information. Finally, we apply a patch-based texture synthesis technique, using a database of 2,213 images collected from 7 subjects, to the resulting animation. With the proposed method, we can automatically generate a speech animation with a realistic inner mouth from an existing speech animation created by previous methods.",
    keywords = "Detailization, Inner mouth, Phoneme combination, Skull bone, Speech animation",
    author = "Masahide Kawai and Tomoyori Iwao and Daisuke Mima and Akinobu Maejima and Shigeo Morishima",
    year = "2014",
    doi = "10.2197/ipsjjip.22.401",
    language = "English",
    volume = "22",
    pages = "401--409",
    journal = "Journal of Information Processing",
    issn = "0387-5806",
    publisher = "Information Processing Society of Japan",
    number = "2",

    }

    TY - JOUR

    T1 - Data-driven speech animation synthesis focusing on realistic inside of the mouth

    AU - Kawai, Masahide

    AU - Iwao, Tomoyori

    AU - Mima, Daisuke

    AU - Maejima, Akinobu

    AU - Morishima, Shigeo

    PY - 2014

    Y1 - 2014

    AB - Speech animation synthesis remains a challenging topic in computer graphics. Despite many attempts, existing methods have not reproduced the detailed appearance of the inner mouth in the resulting animation, such as the tip of the tongue held between the teeth or the back of the tongue. To solve this problem, we propose a data-driven speech animation synthesis method that focuses on the inside of the mouth. First, we classify inner-mouth images into teeth, labeled by the opening distance of the teeth, and tongue, labeled by phoneme information. We then insert them into an existing speech animation according to the opening distance of the teeth and the phoneme information. Finally, we apply a patch-based texture synthesis technique, using a database of 2,213 images collected from 7 subjects, to the resulting animation. With the proposed method, we can automatically generate a speech animation with a realistic inner mouth from an existing speech animation created by previous methods.

    KW - Detailization

    KW - Inner mouth

    KW - Phoneme combination

    KW - Skull bone

    KW - Speech animation

    UR - http://www.scopus.com/inward/record.url?scp=84898663109&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84898663109&partnerID=8YFLogxK

    U2 - 10.2197/ipsjjip.22.401

    DO - 10.2197/ipsjjip.22.401

    M3 - Article

    VL - 22

    SP - 401

    EP - 409

    JO - Journal of Information Processing

    JF - Journal of Information Processing

    SN - 0387-5806

    IS - 2

    ER -