Expressing speaker's intentions through sentence-final intonations for Japanese conversational speech synthesis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

In this study, we investigated the speaker's intentions that listeners perceive through subtly different sentence-final intonations. Approximately 2,000 sentence utterances were recorded, and the fundamental frequency (F0) contours at the last vowel of those sentences were classified using a standard clustering algorithm. Various F0 contours were found: not only simple rising and falling intonations but also rise-fall and fall-rise intonations. To reveal the relationship between intonation and intention, 10 representative contours were selected on the basis of the clustering results. A subjective evaluation was then conducted using the selected contours. Six Japanese sentences whose meaning can change with the sentence-final intonation were synthesized, and the F0 contour at the last vowel of each sentence was replaced with the selected contours. The evaluation by nine listeners showed that, for example, a certain falling intonation could express "conviction," whereas another with a slightly different shape could convey "doubt." Subtle differences in the sentence-final F0 shape were thus found to convey various nuances and connotations.
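The clustering step in the abstract can be illustrated with a small sketch. The abstract does not name the algorithm, the features, or the distance measure, so everything here is an assumption made for illustration: plain k-means stands in for the "standard clustering algorithm", each contour is resampled to 10 points and expressed in semitones relative to its first sample, and the data are synthetic toy contours rather than the recorded utterances. The function names (`resample`, `normalize`, `kmeans`, `representatives`) are hypothetical.

```python
import math
import random


def resample(contour, n_points=10):
    """Linearly interpolate an F0 contour (Hz) to a fixed number of points."""
    if len(contour) == 1:
        return [contour[0]] * n_points
    out = []
    for i in range(n_points):
        pos = i * (len(contour) - 1) / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def normalize(contour):
    """Express a contour in semitones relative to its first sample (assumed choice)."""
    ref = contour[0]
    return [12 * math.log2(f / ref) for f in contour]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means; an assumed stand-in for the unspecified algorithm."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: mean of each cluster (keep old centroid if empty).
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def representatives(points, labels, centroids):
    """Index of the observed contour closest to each centroid (None if empty)."""
    reps = []
    for c, centroid in enumerate(centroids):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        reps.append(min(idx, key=lambda i: sq_dist(points[i], centroid))
                    if idx else None)
    return reps


# Synthetic toy contours (NOT the study's data): four shapes with small jitter.
rng = random.Random(1)
shapes = {
    "rise": lambda t: 200 + 60 * t,
    "fall": lambda t: 200 - 60 * t,
    "rise-fall": lambda t: 200 + 60 * math.sin(math.pi * t),
    "fall-rise": lambda t: 200 - 60 * math.sin(math.pi * t),
}
contours = []
for shape in shapes.values():
    for _ in range(5):
        n = rng.randint(8, 15)  # contours differ in length, hence resampling
        contours.append([shape(i / (n - 1)) + rng.gauss(0, 1.0) for i in range(n)])

features = [normalize(resample(c)) for c in contours]
labels, centroids = kmeans(features, k=4)
reps = representatives(features, labels, centroids)
print("cluster labels:", labels)
print("representative contour indices:", reps)
```

Picking the observed contour nearest each centroid, rather than the centroid itself, is one plausible way to obtain "representative contours" that are real utterance shapes; the paper may have used a different selection rule.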

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 442-445
Number of pages: 4
Volume: 1
Publication status: Published - 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR
Duration: 2012 Sep 9 to 2012 Sep 13

Other

Other: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
City: Portland, OR
Period: 12/9/9 to 12/9/13

Fingerprint

Speech synthesis
Clustering algorithms
listener
evaluation

Keywords

  • Sentence-final intonation
  • Speaker's intention
  • Speech synthesis

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication

Cite this

Iwata, K., & Kobayashi, T. (2012). Expressing speaker's intentions through sentence-final intonations for Japanese conversational speech synthesis. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 (Vol. 1, pp. 442-445)

@inproceedings{65bf5013c56748bcaad4a6894b26e2af,
title = "Expressing speaker's intentions through sentence-final intonations for Japanese conversational speech synthesis",
abstract = "In this study, we investigated the speaker's intentions that listeners perceive through subtly different sentence-final intonations. Approximately 2,000 sentence utterances were recorded, and the fundamental frequency (F0) contours at the last vowel of those sentences were classified using a standard clustering algorithm. Various F0 contours were found: not only simple rising and falling intonations but also rise-fall and fall-rise intonations. To reveal the relationship between intonation and intention, 10 representative contours were selected on the basis of the clustering results. A subjective evaluation was then conducted using the selected contours. Six Japanese sentences whose meaning can change with the sentence-final intonation were synthesized, and the F0 contour at the last vowel of each sentence was replaced with the selected contours. The evaluation by nine listeners showed that, for example, a certain falling intonation could express {"}conviction,{"} whereas another with a slightly different shape could convey {"}doubt.{"} Subtle differences in the sentence-final F0 shape were thus found to convey various nuances and connotations.",
keywords = "Sentence-final intonation, Speaker's intention, Speech synthesis",
author = "Kazuhiko Iwata and Tetsunori Kobayashi",
year = "2012",
language = "English",
isbn = "9781622767595",
volume = "1",
pages = "442--445",
booktitle = "13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012",

}

TY - GEN

T1 - Expressing speaker's intentions through sentence-final intonations for Japanese conversational speech synthesis

AU - Iwata, Kazuhiko

AU - Kobayashi, Tetsunori

PY - 2012

Y1 - 2012

N2 - In this study, we investigated the speaker's intentions that listeners perceive through subtly different sentence-final intonations. Approximately 2,000 sentence utterances were recorded, and the fundamental frequency (F0) contours at the last vowel of those sentences were classified using a standard clustering algorithm. Various F0 contours were found: not only simple rising and falling intonations but also rise-fall and fall-rise intonations. To reveal the relationship between intonation and intention, 10 representative contours were selected on the basis of the clustering results. A subjective evaluation was then conducted using the selected contours. Six Japanese sentences whose meaning can change with the sentence-final intonation were synthesized, and the F0 contour at the last vowel of each sentence was replaced with the selected contours. The evaluation by nine listeners showed that, for example, a certain falling intonation could express "conviction," whereas another with a slightly different shape could convey "doubt." Subtle differences in the sentence-final F0 shape were thus found to convey various nuances and connotations.

AB - In this study, we investigated the speaker's intentions that listeners perceive through subtly different sentence-final intonations. Approximately 2,000 sentence utterances were recorded, and the fundamental frequency (F0) contours at the last vowel of those sentences were classified using a standard clustering algorithm. Various F0 contours were found: not only simple rising and falling intonations but also rise-fall and fall-rise intonations. To reveal the relationship between intonation and intention, 10 representative contours were selected on the basis of the clustering results. A subjective evaluation was then conducted using the selected contours. Six Japanese sentences whose meaning can change with the sentence-final intonation were synthesized, and the F0 contour at the last vowel of each sentence was replaced with the selected contours. The evaluation by nine listeners showed that, for example, a certain falling intonation could express "conviction," whereas another with a slightly different shape could convey "doubt." Subtle differences in the sentence-final F0 shape were thus found to convey various nuances and connotations.

KW - Sentence-final intonation

KW - Speaker's intention

KW - Speech synthesis

UR - http://www.scopus.com/inward/record.url?scp=84878387409&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878387409&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84878387409

SN - 9781622767595

VL - 1

SP - 442

EP - 445

BT - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012

ER -