Global F0 control parameter prediction based on impressions for communicative prosody generation

Lu Shao, Yoko Greenberg, Yoshinori Sagisaka

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    5 Citations (Scopus)

    Abstract

    Aiming at communicative speech synthesis, prosody control using impressions has been proposed, exploiting the correlation between the impressions of input lexicons and prosody. In this paper, as a first step toward computing communicative prosody, we attempt to predict the parameters of the F0 generation model by estimating the impressions of an input sentence from its constituent lexicons. To obtain an impression vector with three dimensions (positive-negative, confident-doubtful, and allowable-unacceptable) for a given input utterance, we propose a computational scheme that calculates impression vectors from the impression scores of the constituent words. Using the resulting sentence impression vectors, F0 control parameters are predicted with three-layered feed-forward neural networks. To evaluate the proposed computational framework, we experimentally confirmed that F0 parameters of communicative speech could be generated from the impressions of input lexicons.
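The two stages described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' scheme: the abstract does not say how word scores are combined, so a plain average is assumed here; the lexicon entries, the hidden-layer size, the number of output parameters, and all names (`WORD_IMPRESSIONS`, `sentence_impression`, `ThreeLayerNet`) are hypothetical; and the network weights are random rather than trained.

```python
import math
import random

# Hypothetical word-level impression scores on the three axes named in the
# abstract: (positive-negative, confident-doubtful, allowable-unacceptable).
# The values are invented for illustration only.
WORD_IMPRESSIONS = {
    "great": (0.9, 0.6, 0.5),
    "maybe": (0.0, -0.7, 0.1),
    "no": (-0.8, 0.4, -0.6),
}


def sentence_impression(words, lexicon=WORD_IMPRESSIONS):
    """Average the impression vectors of the words found in the lexicon.

    Averaging is an assumption. Unknown words are skipped, and a sentence
    with no known words maps to the neutral vector (0, 0, 0).
    """
    known = [lexicon[w] for w in words if w in lexicon]
    if not known:
        return (0.0, 0.0, 0.0)
    return tuple(sum(axis) / len(known) for axis in zip(*known))


class ThreeLayerNet:
    """Input layer -> one tanh hidden layer -> linear output layer.

    Weights are randomly initialised; no training step is shown.
    """

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Hidden activations: tanh of the weighted input sum plus bias.
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Linear output: one value per (hypothetical) F0 control parameter.
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(self.w2, self.b2)]


# Sentence-level impression vector for a toy two-word utterance.
vec = sentence_impression(["maybe", "no"])

# Three inputs (one per impression axis); two outputs is an arbitrary choice.
net = ThreeLayerNet(n_in=3, n_hidden=4, n_out=2)
f0_params = net.forward(vec)
```

In a real system the network would be trained on pairs of sentence impression vectors and F0 model parameters extracted from recorded speech; the untrained outputs here carry no meaning beyond showing the data flow.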

    Original language: English
    Title of host publication: 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013
    DOIs: https://doi.org/10.1109/ICSDA.2013.6709871
    Publication status: Published - 2013
    Event: 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013 - Gurgaon
    Duration: 2013 Nov 25 → 2013 Nov 27

    Other

    Other: 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013
    City: Gurgaon
    Period: 13/11/25 → 13/11/27

    Fingerprint

    Speech synthesis
    Feedforward neural networks

    Keywords

    • communicative speech synthesis
    • impression
    • neural network
    • speech prosody control

    ASJC Scopus subject areas

    • Software

    Cite this

    Shao, L., Greenberg, Y., & Sagisaka, Y. (2013). Global F0 control parameter prediction based on impressions for communicative prosody generation. In 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013 [6709871] https://doi.org/10.1109/ICSDA.2013.6709871

    @inproceedings{6139947108d748ec929ca77af98a0694,
    title = "Global F0 control parameter prediction based on impressions for communicative prosody generation",
    abstract = "Aiming at communicative speech synthesis, prosody control using impressions has been proposed by applying the correlation between impressions of input lexicons and prosody. In this paper, as the first step to compute communicative prosody, we attempt to predict the F0 generation model parameters by estimating the impressions of input sentence from its constituent lexicons. To obtain an impression vector consisting of three dimensional factors (positive-negative, confident-doubtful and allowable-unacceptable) for a given input utterance, we proposed a computational scheme to calculate impression vectors using impression scores of constituent words. Using obtained sentence impression vectors, F0 control parameters are predicted by applying three-layered feed-forward neural networks. To evaluate the effectiveness of the proposed computational framework, we experimentally confirmed that F0 parameters of communicative speech could be generated from the impressions of input lexicons.",
    keywords = "communicative speech synthesis, impression, neural network, speech prosody control",
    author = "Lu Shao and Yoko Greenberg and Yoshinori Sagisaka",
    year = "2013",
    doi = "10.1109/ICSDA.2013.6709871",
    language = "English",
    isbn = "9781479923786",
    booktitle = "2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013",

    }

    TY - GEN

    T1 - Global F0 control parameter prediction based on impressions for communicative prosody generation

    AU - Shao, Lu

    AU - Greenberg, Yoko

    AU - Sagisaka, Yoshinori

    PY - 2013

    Y1 - 2013

    N2 - Aiming at communicative speech synthesis, prosody control using impressions has been proposed by applying the correlation between impressions of input lexicons and prosody. In this paper, as the first step to compute communicative prosody, we attempt to predict the F0 generation model parameters by estimating the impressions of input sentence from its constituent lexicons. To obtain an impression vector consisting of three dimensional factors (positive-negative, confident-doubtful and allowable-unacceptable) for a given input utterance, we proposed a computational scheme to calculate impression vectors using impression scores of constituent words. Using obtained sentence impression vectors, F0 control parameters are predicted by applying three-layered feed-forward neural networks. To evaluate the effectiveness of the proposed computational framework, we experimentally confirmed that F0 parameters of communicative speech could be generated from the impressions of input lexicons.

    KW - communicative speech synthesis

    KW - impression

    KW - neural network

    KW - speech prosody control

    UR - http://www.scopus.com/inward/record.url?scp=84894120374&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84894120374&partnerID=8YFLogxK

    U2 - 10.1109/ICSDA.2013.6709871

    DO - 10.1109/ICSDA.2013.6709871

    M3 - Conference contribution

    AN - SCOPUS:84894120374

    SN - 9781479923786

    BT - 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013

    ER -