Concatenative speech synthesis by minimum distortion criteria

Naoto Iwahashi, Nobuyoshi Kaiki, Yoshinori Sagisaka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

A new scheme is proposed for concatenative speech synthesis to improve the segment selection procedure by minimizing acoustic distortions between the selected segment and the desired spectrum for the target. The spectral prototypicality of a segment, the spectral difference between the source and target contexts, the degradation resulting from concatenation of phonemes, and the acoustic continuity between the concatenated segments are all considered as measures. A search method for selecting segments from a large speech database is also described. In this method, a three-step optimization is used for distortion minimization. A perceptual test shows that contextual spectral difference and acoustic continuity at the segment boundary are important measures for improving the quality of synthesized speech.
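To make the idea of selecting segments by minimizing distortion concrete, here is a minimal sketch in Python. It is not the three-step optimization from the paper, whose details the abstract does not give. It swaps in a generic dynamic-programming search over candidate segments. The names `select_segments`, `unit_cost`, and `join_cost` are hypothetical: `unit_cost` stands in for the per-segment measures (for example prototypicality and contextual spectral difference), and `join_cost` stands in for the boundary measures (concatenation degradation and acoustic continuity). The toy data uses single numbers in place of spectra.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Seg = TypeVar("Seg")


def select_segments(
    candidates: Sequence[Sequence[Seg]],
    unit_cost: Callable[[int, Seg], float],
    join_cost: Callable[[Seg, Seg], float],
) -> Tuple[List[Seg], float]:
    """Pick one candidate per target position so that the summed
    per-segment and boundary distortions are minimal (hypothetical helper)."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[j]: lowest accumulated cost of any path ending in candidate j
    # at the position currently being processed.
    best = [unit_cost(0, s) for s in candidates[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor index at t-1

    for t in range(1, len(candidates)):
        new_best: List[float] = []
        pointers: List[int] = []
        for s in candidates[t]:
            costs = [best[k] + join_cost(p, s)
                     for k, p in enumerate(candidates[t - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + unit_cost(t, s))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)], total


# Toy example (assumed data): one scalar "spectrum" per segment.
targets = [1.0, 2.0, 3.0]
candidates = [[1.0, 5.0], [2.5, 1.0], [3.0, 0.0]]

chosen, total = select_segments(
    candidates,
    unit_cost=lambda t, s: abs(s - targets[t]),  # distance to desired target
    join_cost=lambda a, b: abs(a - b),           # discontinuity at the boundary
)
print(chosen, total)  # [1.0, 2.5, 3.0] 2.5
```

In the toy run, the second position picks 2.5 rather than 1.0 even though its accumulated cost at that point is higher, because it joins more smoothly with the final segment. This is the kind of trade-off between target match and boundary continuity that the abstract describes.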

Original language: English
Title of host publication: ICASSP 1992 - 1992 International Conference on Acoustics, Speech, and Signal Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 65-68
Number of pages: 4
ISBN (Electronic): 0780305329
Publication status: Published - 1992
Externally published: Yes
Event: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 1992 - San Francisco, United States
Duration: 1992 Mar 23 to 1992 Mar 26

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2
ISSN (Print): 1520-6149

Other

Other: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 1992
Country/Territory: United States
City: San Francisco
Period: 92/3/23 to 92/3/26

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
