Speech Segment Selection for Concatenative Synthesis Based on Spectral Distortion Minimization

Naoto Iwahashi, Nobuyoshi Kaiki, Yoshinori Sagisaka

Research output: Article

25 Citations (Scopus)

Abstract

This paper proposes a new scheme for concatenative speech synthesis that improves the speech segment selection procedure. The proposed scheme selects a segment sequence for concatenation by minimizing acoustic distortions between the selected segments and the desired spectrum for the target, without the use of heuristics. Four types of distortion are formulated as acoustic quantities and used as measures for minimization: a) the spectral prototypicality of a segment, b) the spectral difference between the source and target contexts, c) the degradation resulting from concatenation of phonemes, and d) the acoustic discontinuity between the concatenated segments. A search method for selecting segments from a large speech database is also described. In this method, a three-step optimization using dynamic programming minimizes the four types of distortion. A perceptual test shows that the proposed segment selection method with minimum distortion criteria produces high-quality synthesized speech, and that the contextual spectral difference and the acoustic discontinuity at the segment boundary are important measures for improving the quality.
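
As a rough illustration of the kind of search the abstract describes, the following Python sketch runs a single-pass dynamic-programming (Viterbi-style) selection over candidate segments. It is not the paper's three-step optimization, and it does not reproduce the paper's four distortion measures: the function name `select_segments` and the two cost callbacks `target_cost` and `join_cost` are hypothetical stand-ins, where `target_cost` would lump together the per-segment distortions and `join_cost` the distortions at a concatenation boundary.

```python
from typing import Callable, List, Sequence, Tuple


def select_segments(
    candidates: Sequence[Sequence[object]],
    target_cost: Callable[[int, object], float],
    join_cost: Callable[[object, object], float],
) -> Tuple[float, List[object]]:
    """Pick one candidate per target position, minimizing the summed costs.

    All names here are illustrative assumptions, not taken from the paper.
    candidates[t] lists the database segments usable at target position t.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[j]: minimal accumulated cost of any path ending in candidate j
    # of the position processed most recently.
    best = [target_cost(0, c) for c in candidates[0]]
    back: List[List[int]] = []  # back[t - 1][j]: best predecessor index

    for t in range(1, len(candidates)):
        new_best: List[float] = []
        pointers: List[int] = []
        for c in candidates[t]:
            # Cost of reaching c from each candidate of the previous position.
            costs = [
                best[i] + join_cost(prev, c)
                for i, prev in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min] + target_cost(t, c))
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, [candidates[t][k] for t, k in enumerate(path)]
```

With N target positions and at most K candidates per position, this sketch evaluates on the order of N·K² joins, which is why a dynamic-programming formulation is attractive when selecting from a large database.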

Original language: English
Pages (from-to): 1942-1948
Number of pages: 7
Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Volume: E76-A
Issue number: 11
Publication status: Published - Nov 1 1993
Externally published: Yes

ASJC Scopus subject areas

  • Signal Processing
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering
  • Applied Mathematics
