Tree-based unit selection for English speech synthesis

Wern Jun Wang, W. N. Campbell, Naoto Iwahashi, Yoshinori Sagisaka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

In concatenative speech synthesis for English, scarcity of speech data for many contexts is a serious problem. In this paper, we propose a new unit selection scheme using a decision-tree-based clustering method that combines acoustic and linguistic knowledge with statistical modeling. This approach allows us not only to find a trainable and consistent set of generalized allophonic models but also to achieve some local optimality with respect to the limited training data. To evaluate the validity of this algorithm, regression tree generation has been carried out for both vowels and consonants from 200 phonetically balanced sentences read by a female speaker. Experimental results show that regression trees offer a promising solution for the data scarcity problem.

Original language: English
Title of host publication: Speech Processing
Publisher: Publ by IEEE
Pages: II-191-II-194
ISBN (Print): 0780309464
Publication status: Published - 1993 Jan 1
Externally published: Yes
Event: 1993 IEEE International Conference on Acoustics, Speech and Signal Processing - Minneapolis, MN, USA
Duration: 1993 Apr 27 to 1993 Apr 30

Publication series

Name: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Volume: 2
ISSN (Print): 0736-7791

Other

Other: 1993 IEEE International Conference on Acoustics, Speech and Signal Processing
City: Minneapolis, MN, USA
Period: 93/4/27 to 93/4/30

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
