Analysis and modeling of syllable duration for Thai speech synthesis

Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Rungkarn Siricharoenchai, Yoshinori Sagisaka

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    This paper presents an analysis of the factors that control Thai syllable duration, and a statistical duration control model based on linear regression. The analyses were carried out at both the syllable level and the phrase level. At the syllable level, the effects of the five Thai tones and of syllable structure were investigated. To analyze syllable structure effects statistically, we applied quantification theory with two linguistic factors: (1) the phone categories themselves, and (2) the categories grouped by articulatory similarity. At the phrase level, the effects of position in a phrase and of the number of syllables in a phrase were analyzed. The experimental results showed that tone, syllable structure, and position in a phrase play significant roles in syllable duration control, whereas the number of syllables in a phrase affects syllable duration only slightly. These analysis results were integrated into a statistical control model. The duration assignment precision of the proposed model was evaluated using 2480-word speech data. A total correlation of 0.73 between predicted and observed values for test set samples indicates fair precision of the proposed control model.
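
    The kind of model the abstract describes can be illustrated with a minimal sketch. It assumes that the quantification theory applied is Hayashi's type I, which amounts to least-squares regression on dummy-coded categorical factors; the abstract does not state the type. The factor levels, effect sizes, and durations below are invented for illustration and are not the paper's data, and only two of the factors (tone and position in a phrase) are included.

```python
import math
import random

# Hypothetical factor levels for illustration only; the abstract does not
# list the paper's actual category inventory.
FACTORS = {
    "tone": ["mid", "low", "falling", "high", "rising"],
    "position": ["initial", "medial", "final"],
}

def encode(sample):
    """Dummy-code a sample: intercept plus one 0/1 column per non-reference level."""
    row = [1.0]
    for name, levels in FACTORS.items():
        row.extend(1.0 if sample[name] == level else 0.0 for level in levels[1:])
    return row

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit(samples, durations):
    """Least-squares fit through the normal equations (X^T X) w = X^T y."""
    X = [encode(s) for s in samples]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, durations)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, sample):
    """Predicted duration: intercept plus the weights of the active levels."""
    return sum(w * x for w, x in zip(weights, encode(sample)))

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs)
    sy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sx * sy)

def make_synthetic(n, seed):
    """Synthetic durations in ms; effect sizes are invented, not the paper's."""
    rng = random.Random(seed)
    tone_fx = {"mid": 0, "low": 10, "falling": 25, "high": 15, "rising": 35}
    pos_fx = {"initial": 0, "medial": -20, "final": 60}
    samples, durations = [], []
    for _ in range(n):
        s = {"tone": rng.choice(FACTORS["tone"]),
             "position": rng.choice(FACTORS["position"])}
        samples.append(s)
        durations.append(250 + tone_fx[s["tone"]] + pos_fx[s["position"]]
                         + rng.gauss(0, 30))
    return samples, durations

train_x, train_y = make_synthetic(400, seed=1)
test_x, test_y = make_synthetic(100, seed=2)
weights = fit(train_x, train_y)
predicted = [predict(weights, s) for s in test_x]
r = pearson(predicted, test_y)
print(f"test-set correlation: {r:.2f}")
```

    Each fitted weight can be read as the duration offset of one category relative to its reference level, and the test-set correlation plays the same role as the total correlation reported in the abstract. The value printed here reflects only the synthetic data.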

    Original language: English
    Title of host publication: EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
    Publisher: International Speech Communication Association
    Pages: 93-96
    Number of pages: 4
    Publication status: Published - 2003
    Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
    Duration: 2003 Sep 1 to 2003 Sep 4

    Other

    Other: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003
    Country: Switzerland
    City: Geneva
    Period: 03/9/1 to 03/9/4

    ASJC Scopus subject areas

    • Computer Science Applications
    • Software
    • Linguistics and Language
    • Communication

    Cite this

    Hansakunbuntheung, C., Tesprasit, V., Siricharoenchai, R., & Sagisaka, Y. (2003). Analysis and modeling of syllable duration for Thai speech synthesis. In EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology (pp. 93-96). International Speech Communication Association.

    @inproceedings{dd16475b7f104d148e19aae16e3192cf,
    title = "Analysis and modeling of syllable duration for Thai speech synthesis",
    abstract = "This paper presents an analysis of the factors that control Thai syllable duration, and a statistical duration control model based on linear regression. The analyses were carried out at both the syllable level and the phrase level. At the syllable level, the effects of the five Thai tones and of syllable structure were investigated. To analyze syllable structure effects statistically, we applied quantification theory with two linguistic factors: (1) the phone categories themselves, and (2) the categories grouped by articulatory similarity. At the phrase level, the effects of position in a phrase and of the number of syllables in a phrase were analyzed. The experimental results showed that tone, syllable structure, and position in a phrase play significant roles in syllable duration control, whereas the number of syllables in a phrase affects syllable duration only slightly. These analysis results were integrated into a statistical control model. The duration assignment precision of the proposed model was evaluated using 2480-word speech data. A total correlation of 0.73 between predicted and observed values for test set samples indicates fair precision of the proposed control model.",
    author = "Chatchawarn Hansakunbuntheung and Virongrong Tesprasit and Rungkarn Siricharoenchai and Yoshinori Sagisaka",
    year = "2003",
    language = "English",
    pages = "93--96",
    booktitle = "EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology",
    publisher = "International Speech Communication Association"
    }

    TY  - GEN
    T1  - Analysis and modeling of syllable duration for Thai speech synthesis
    AU  - Hansakunbuntheung, Chatchawarn
    AU  - Tesprasit, Virongrong
    AU  - Siricharoenchai, Rungkarn
    AU  - Sagisaka, Yoshinori
    PY  - 2003
    Y1  - 2003
    AB  - This paper presents an analysis of the factors that control Thai syllable duration, and a statistical duration control model based on linear regression. The analyses were carried out at both the syllable level and the phrase level. At the syllable level, the effects of the five Thai tones and of syllable structure were investigated. To analyze syllable structure effects statistically, we applied quantification theory with two linguistic factors: (1) the phone categories themselves, and (2) the categories grouped by articulatory similarity. At the phrase level, the effects of position in a phrase and of the number of syllables in a phrase were analyzed. The experimental results showed that tone, syllable structure, and position in a phrase play significant roles in syllable duration control, whereas the number of syllables in a phrase affects syllable duration only slightly. These analysis results were integrated into a statistical control model. The duration assignment precision of the proposed model was evaluated using 2480-word speech data. A total correlation of 0.73 between predicted and observed values for test set samples indicates fair precision of the proposed control model.
    UR  - http://www.scopus.com/inward/record.url?scp=85009168093&partnerID=8YFLogxK
    UR  - http://www.scopus.com/inward/citedby.url?scp=85009168093&partnerID=8YFLogxK
    M3  - Conference contribution
    AN  - SCOPUS:85009168093
    SP  - 93
    EP  - 96
    BT  - EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
    PB  - International Speech Communication Association
    ER  -