Global F0 control parameter prediction based on impressions for communicative prosody generation

Lu Shao, Yoko Greenberg, Yoshinori Sagisaka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Aiming at communicative speech synthesis, prosody control using impressions has been proposed, exploiting the correlation between the impressions conveyed by input lexicons and prosody. In this paper, as a first step toward computing communicative prosody, we attempt to predict the parameters of an F0 generation model by estimating the impression of an input sentence from its constituent lexicons. To obtain an impression vector with three dimensions (positive-negative, confident-doubtful, and allowable-unacceptable) for a given input utterance, we propose a computational scheme that calculates the vector from the impression scores of the constituent words. From the resulting sentence impression vectors, F0 control parameters are predicted with three-layered feed-forward neural networks. Experiments evaluating the proposed computational framework confirmed that F0 parameters of communicative speech can be generated from the impressions of input lexicons.
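
The pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the lexicon entries and their scores are invented, the word-to-sentence combination rule is a plain average (the abstract does not state the actual scheme), the hidden-layer size and the two-element output are placeholders for whatever F0 control parameters the paper predicts, and the network weights are random and untrained.

```python
import math
import random

# Hypothetical impression lexicon. Each word maps to scores on the three
# dimensions named in the abstract:
# (positive-negative, confident-doubtful, allowable-unacceptable).
LEXICON = {
    "great": (0.8, 0.5, 0.6),
    "maybe": (0.0, -0.7, 0.1),
    "no": (-0.6, 0.4, -0.8),
}


def sentence_impression(words, lexicon=LEXICON):
    """Average the scores of known words (an assumed combination rule)."""
    vectors = [lexicon[w] for w in words if w in lexicon]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(3)]


def init_network(n_in=3, n_hidden=8, n_out=2, seed=0):
    """Create random weights for an input-hidden-output network."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": matrix(n_in, n_hidden),
        "b1": [0.0] * n_hidden,
        "W2": matrix(n_hidden, n_out),
        "b2": [0.0] * n_out,
    }


def dense(x, weights, bias):
    """Compute x @ weights + bias for plain Python lists."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def predict_f0_params(impression, net):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = [math.tanh(h) for h in dense(impression, net["W1"], net["b1"])]
    return dense(hidden, net["W2"], net["b2"])


if __name__ == "__main__":
    vec = sentence_impression(["great", "maybe"])
    print("impression vector:", vec)
    print("predicted parameters:", predict_f0_params(vec, init_network()))
```

In practice the weights would be fitted to pairs of impression vectors and F0 parameters measured from recorded speech. With the random weights used here, the printed values carry no meaning beyond showing the data flow.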

Original language: English
Title of host publication: 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013
DOIs: https://doi.org/10.1109/ICSDA.2013.6709871
Publication status: Published - 2013 Dec 1
Event: 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013 - Gurgaon, India
Duration: 2013 Nov 25 → 2013 Nov 27

Publication series

Name: 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013

Conference

Conference: 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013
Country: India
City: Gurgaon
Period: 13/11/25 → 13/11/27

Keywords

  • communicative speech synthesis
  • impression
  • neural network
  • speech prosody control

ASJC Scopus subject areas

  • Software

Cite this

Shao, L., Greenberg, Y., & Sagisaka, Y. (2013). Global F0 control parameter prediction based on impressions for communicative prosody generation. In 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013 [6709871] (2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013). https://doi.org/10.1109/ICSDA.2013.6709871