Emotional speech synthesis by sensing affective information from text

Mostafa Al Masum Shaikh, Antonio Rui Ferreira Rebordao, Keikichi Hirose, Mitsuru Ishizuka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Speech can express subjective meanings and intents that, in order to be fully understood, rely heavily on its affective perception. Some Text-to-Speech (TTS) systems reveal weaknesses in their emotional expressivity, but this situation can be improved by a better parametrization of the acoustic and prosodic parameters. This paper describes an approach for better emotional expressivity in a speech synthesizer. Our technique uses several linguistic resources to recognize emotions in a text and assigns appropriate parameters to the synthesizer to carry out a suitable speech synthesis. For evaluation purposes, we used the MARY TTS system to read out "happy" and "sad" news. The preliminary perceptual test results are encouraging: human judges, listening to the synthesized speech obtained with our approach, could perceive "happy" emotions much better than when they listened to non-affective synthesized speech.
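The pipeline the abstract describes — sensing an emotion from text, then assigning prosodic parameters to the synthesizer — can be sketched as follows. This is a minimal illustration only: the lexicon, the emotion set, and the parameter values are invented for this example and are not the authors' actual linguistic resources or MARY TTS's real configuration.

```python
# Hypothetical sketch: sense an affect label from text, then map it to
# prosodic settings a TTS front end could consume. Lexicon and presets
# below are illustrative placeholders, not the paper's actual resources.

AFFECT_LEXICON = {
    "happy": {"win", "celebrate", "joy", "success", "delighted"},
    "sad": {"loss", "mourn", "tragedy", "grief", "dies"},
}

# Illustrative prosody presets: raised pitch and faster rate for "happy",
# lowered pitch and slower rate for "sad", neutral values otherwise.
PROSODY = {
    "happy":   {"pitch_shift": "+15%", "rate": "+10%", "volume": "loud"},
    "sad":     {"pitch_shift": "-10%", "rate": "-15%", "volume": "soft"},
    "neutral": {"pitch_shift": "+0%",  "rate": "+0%",  "volume": "medium"},
}

def sense_emotion(text: str) -> str:
    """Return the emotion whose lexicon overlaps most with the text."""
    words = set(text.lower().split())
    scores = {emo: len(words & lex) for emo, lex in AFFECT_LEXICON.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"

def prosody_for(text: str) -> dict:
    """Choose prosodic parameters to hand to a synthesizer front end."""
    return PROSODY[sense_emotion(text)]
```

For example, `prosody_for("The team will celebrate a great success")` yields the "happy" preset. In a real system, such settings would be serialized into whatever markup the synthesizer accepts for prosody control, and the emotion detection would come from the paper's linguistic resources rather than a keyword list.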

Original language: English
Title of host publication: Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009
DOIs
Publication status: Published - 2009
Externally published: Yes
Event: 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009 - Amsterdam
Duration: 2009 Sep 10 - 2009 Sep 12

Other

Other: 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009
City: Amsterdam
Period: 09/9/10 - 09/9/12

Fingerprint

Speech synthesis
Linguistics
Acoustics

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Software

Cite this

Shaikh, M. A. M., Rebordao, A. R. F., Hirose, K., & Ishizuka, M. (2009). Emotional speech synthesis by sensing affective information from text. In Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009 [5349515] https://doi.org/10.1109/ACII.2009.5349515

Emotional speech synthesis by sensing affective information from text. / Shaikh, Mostafa Al Masum; Rebordao, Antonio Rui Ferreira; Hirose, Keikichi; Ishizuka, Mitsuru.

Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009. 2009. 5349515.


Shaikh, MAM, Rebordao, ARF, Hirose, K & Ishizuka, M 2009, Emotional speech synthesis by sensing affective information from text. in Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009., 5349515, 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009, Amsterdam, 09/9/10. https://doi.org/10.1109/ACII.2009.5349515
Shaikh MAM, Rebordao ARF, Hirose K, Ishizuka M. Emotional speech synthesis by sensing affective information from text. In Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009. 2009. 5349515 https://doi.org/10.1109/ACII.2009.5349515
Shaikh, Mostafa Al Masum ; Rebordao, Antonio Rui Ferreira ; Hirose, Keikichi ; Ishizuka, Mitsuru. / Emotional speech synthesis by sensing affective information from text. Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009. 2009.
@inproceedings{5b680b8733f3481ba318eb088aec12e2,
title = "Emotional speech synthesis by sensing affective information from text",
abstract = "Speech can express subjective meanings and intents that, in order to be fully understood, rely heavily on its affective perception. Some Text-to-Speech (TTS) systems reveal weaknesses in their emotional expressivity, but this situation can be improved by a better parametrization of the acoustic and prosodic parameters. This paper describes an approach for better emotional expressivity in a speech synthesizer. Our technique uses several linguistic resources to recognize emotions in a text and assigns appropriate parameters to the synthesizer to carry out a suitable speech synthesis. For evaluation purposes, we used the MARY TTS system to read out {"}happy{"} and {"}sad{"} news. The preliminary perceptual test results are encouraging: human judges, listening to the synthesized speech obtained with our approach, could perceive {"}happy{"} emotions much better than when they listened to non-affective synthesized speech.",
author = "Shaikh, {Mostafa Al Masum} and Rebordao, {Antonio Rui Ferreira} and Keikichi Hirose and Mitsuru Ishizuka",
year = "2009",
doi = "10.1109/ACII.2009.5349515",
language = "English",
isbn = "9781424447992",
booktitle = "Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009",

}

TY - GEN

T1 - Emotional speech synthesis by sensing affective information from text

AU - Shaikh, Mostafa Al Masum

AU - Rebordao, Antonio Rui Ferreira

AU - Hirose, Keikichi

AU - Ishizuka, Mitsuru

PY - 2009

Y1 - 2009

N2 - Speech can express subjective meanings and intents that, in order to be fully understood, rely heavily on its affective perception. Some Text-to-Speech (TTS) systems reveal weaknesses in their emotional expressivity, but this situation can be improved by a better parametrization of the acoustic and prosodic parameters. This paper describes an approach for better emotional expressivity in a speech synthesizer. Our technique uses several linguistic resources to recognize emotions in a text and assigns appropriate parameters to the synthesizer to carry out a suitable speech synthesis. For evaluation purposes, we used the MARY TTS system to read out "happy" and "sad" news. The preliminary perceptual test results are encouraging: human judges, listening to the synthesized speech obtained with our approach, could perceive "happy" emotions much better than when they listened to non-affective synthesized speech.

AB - Speech can express subjective meanings and intents that, in order to be fully understood, rely heavily on its affective perception. Some Text-to-Speech (TTS) systems reveal weaknesses in their emotional expressivity, but this situation can be improved by a better parametrization of the acoustic and prosodic parameters. This paper describes an approach for better emotional expressivity in a speech synthesizer. Our technique uses several linguistic resources to recognize emotions in a text and assigns appropriate parameters to the synthesizer to carry out a suitable speech synthesis. For evaluation purposes, we used the MARY TTS system to read out "happy" and "sad" news. The preliminary perceptual test results are encouraging: human judges, listening to the synthesized speech obtained with our approach, could perceive "happy" emotions much better than when they listened to non-affective synthesized speech.

UR - http://www.scopus.com/inward/record.url?scp=77949361299&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77949361299&partnerID=8YFLogxK

U2 - 10.1109/ACII.2009.5349515

DO - 10.1109/ACII.2009.5349515

M3 - Conference contribution

SN - 9781424447992

BT - Proceedings - 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII 2009

ER -