Abstract
Aiming at automatic estimation of the naturalness of timing control in non-native speech, we analyzed the timing characteristics of non-native speech and correlated them with subjective naturalness scores given by native speakers. In addition to word-level statistics showing the differences between natives and non-natives, we analyzed phone- and syllable-level statistics to obtain an objective measure that better fits natives' judgments. A corpus of English speech spoken by Japanese speakers was collected, together with temporal naturalness judgments by native speakers. The analysis showed that timing differences between natives and non-natives in average syllable durations, weak vowel durations, and vowel durations of function words were highly correlated with natives' naturalness evaluations. A linear regression model and a regression tree model were employed to estimate the naturalness evaluation score from the differences between native and non-native speech. The estimation accuracy of the proposed naturalness evaluation models was tested using open data. The root mean square errors between the scores predicted by the two models and the scores given by the natives were 0.63 and 0.66, respectively, comparable to the difference of 0.70 among the native listeners' scores. These accuracies were better than that of a model using word statistics only.
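The evaluation described in the abstract (fit a regression on timing differences, then report the root mean square error against native listeners' scores on held-out data) can be sketched as follows. This is a minimal illustration only: it assumes a single timing-difference predictor, whereas the paper combines several, and the function names and all numbers are invented for the example and are not taken from the paper.

```python
import math


def fit_simple_linear(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(predicted, observed):
    """Root mean square error between predicted and observed scores."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical training data: native vs. non-native timing difference
# (arbitrary units) and the naturalness score assigned by native listeners.
train_diff = [0.0, 1.0, 2.0, 3.0]
train_score = [5.0, 4.0, 3.0, 2.0]
intercept, slope = fit_simple_linear(train_diff, train_score)

# Hypothetical held-out ("open") data, not used for fitting.
test_diff = [0.5, 2.5]
test_score = [4.0, 3.0]
predictions = [intercept + slope * d for d in test_diff]
error = rmse(predictions, test_score)
print(f"intercept={intercept:.2f} slope={slope:.2f} rmse={error:.2f}")
```

Evaluating on data held out from fitting mirrors the open-data test in the abstract. The resulting error is meaningful only relative to a baseline such as the disagreement among native listeners.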
Original language | English |
---|---|
Pages | 1673-1676 |
Number of pages | 4 |
Publication status | Published - 2004 |
Externally published | Yes |
Event | 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of. Duration: 4 Oct 2004 → 8 Oct 2004 |
Other
Other | 8th International Conference on Spoken Language Processing, ICSLP 2004 |
---|---|
Country/Territory | Korea, Republic of |
City | Jeju, Jeju Island |
Period | 4 Oct 2004 → 8 Oct 2004 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language