In this paper, we propose a speech spectrum transformation method for speech synthesis that interpolates spectral patterns among multiple pre-stored speakers. The interpolation is carried out on spectral parameters such as cepstrum and log area ratio to generate new spectrum patterns. The spectral patterns can be transformed smoothly as the interpolation ratio is gradually changed, and speech individuality can easily be controlled between the interpolated speakers. Adaptation to a target speaker can be performed by this interpolation, which uses only a small amount of training data to generate a new speech spectrum sequence close to the target speaker's. An adaptation experiment was carried out using only one word spoken by the target speaker for learning. It was shown that the distance between the target speaker's spectrum and the spectrum generated by the proposed interpolation method is about 40% smaller than the distance between the target speaker's spectrum and the spectrum of the pre-stored speaker closest to the target.
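The interpolation idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each speaker is represented by one time-aligned parameter vector per frame (for example cepstral coefficients), that the interpolation is a weighted sum with weights summing to 1, and that adaptation between two speakers is a least-squares fit of a single ratio. The names `interpolate` and `fit_ratio` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def interpolate(speakers: Sequence[Sequence[float]],
                weights: Sequence[float]) -> Vector:
    """Weighted sum of per-speaker parameter vectors for one frame.

    Assumption: the vectors are already time-aligned and share one
    parameter domain (e.g. cepstrum), and the weights sum to 1.
    """
    if len(speakers) != len(weights):
        raise ValueError("one weight per speaker is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    dim = len(speakers[0])
    if any(len(s) != dim for s in speakers):
        raise ValueError("all parameter vectors must have the same length")
    return [sum(w * s[k] for w, s in zip(weights, speakers))
            for k in range(dim)]


def fit_ratio(a: Sequence[float], b: Sequence[float],
              target: Sequence[float]) -> float:
    """Ratio r minimizing ||target - ((1 - r) * a + r * b)||^2.

    Assumption: a plain squared Euclidean distance stands in for the
    spectral distance measure; the paper's measure is not given here.
    """
    diff = [y - x for x, y in zip(a, b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("speakers a and b are identical")
    return sum((t - x) * d for t, x, d in zip(target, a, diff)) / denom


if __name__ == "__main__":
    speaker_a = [1.0, 0.0]   # toy parameter vectors, not real data
    speaker_b = [3.0, 2.0]
    frame = interpolate([speaker_a, speaker_b], [0.75, 0.25])
    print(frame)                                   # [1.5, 0.5]
    print(fit_ratio(speaker_a, speaker_b, frame))  # 0.25
```

Sweeping the second weight from 0 to 1 moves the generated vector smoothly from the first speaker to the second, which mirrors the gradual change of the interpolation ratio described in the abstract.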
|Journal||ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|Publication status||Published - 1994|
|Event||Proceedings of the 1994 IEEE International Conference on Acoustics, Speech and Signal Processing. Part 2 (of 6) - Adelaide, Aust|
Duration: 19 Apr 1994 → 22 Apr 1994