TY - GEN
T1 - Global F0 control parameter prediction based on impressions for communicative prosody generation
AU - Shao, Lu
AU - Greenberg, Yoko
AU - Sagisaka, Yoshinori
PY - 2013/12/1
Y1 - 2013/12/1
N2 - For communicative speech synthesis, prosody control using impressions has been proposed, exploiting the correlation between the impressions of input lexicons and prosody. In this paper, as a first step toward computing communicative prosody, we attempt to predict the F0 generation model parameters by estimating the impression of an input sentence from its constituent lexicons. To obtain an impression vector consisting of three dimensions (positive-negative, confident-doubtful and allowable-unacceptable) for a given input utterance, we propose a computational scheme that calculates impression vectors from the impression scores of the constituent words. Using the obtained sentence impression vectors, F0 control parameters are predicted with three-layered feed-forward neural networks. In an evaluation of the proposed computational framework, we experimentally confirmed that F0 parameters of communicative speech could be generated from the impressions of input lexicons.
AB - For communicative speech synthesis, prosody control using impressions has been proposed, exploiting the correlation between the impressions of input lexicons and prosody. In this paper, as a first step toward computing communicative prosody, we attempt to predict the F0 generation model parameters by estimating the impression of an input sentence from its constituent lexicons. To obtain an impression vector consisting of three dimensions (positive-negative, confident-doubtful and allowable-unacceptable) for a given input utterance, we propose a computational scheme that calculates impression vectors from the impression scores of the constituent words. Using the obtained sentence impression vectors, F0 control parameters are predicted with three-layered feed-forward neural networks. In an evaluation of the proposed computational framework, we experimentally confirmed that F0 parameters of communicative speech could be generated from the impressions of input lexicons.
KW - communicative speech synthesis
KW - impression
KW - neural network
KW - speech prosody control
UR - http://www.scopus.com/inward/record.url?scp=84894120374&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84894120374&partnerID=8YFLogxK
U2 - 10.1109/ICSDA.2013.6709871
DO - 10.1109/ICSDA.2013.6709871
M3 - Conference contribution
AN - SCOPUS:84894120374
SN - 9781479923786
T3 - 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013
BT - 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013
T2 - 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation, O-COCOSDA/CASLRE 2013
Y2 - 25 November 2013 through 27 November 2013
ER -