This paper describes an experimental study of distance measures used to represent dynamic features of speech. The authors have been constructing a speech database covering various phonetic environments using acoustic-phonetic segments. In this system, the transitional parts of speech, from a consonant to a vowel or from one vowel to another, are labeled as individual acoustic-phonetic segments. A suitable distance measure between these transitional segments is therefore needed. Two distance measures are studied in this paper. The first takes distances between the time derivatives of the feature parameters; the second subtracts the average over time from the matrix representation of the feature parameters. Mel-cepstrum coefficients were used as the feature parameters. Discrimination experiments were conducted using formant patterns and acoustic-phonetic segments of the transitional parts of the speech database. Both distance measures are shown to be effective in discriminating the transitional parts of speech.
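The two measures can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the derivative is computed, which base distance is used, or how segments of different lengths are aligned. The sketch assumes a first-order frame difference as the time derivative, a Euclidean distance over the whole matrix, and two segments with the same number of frames. All function names are hypothetical.

```python
import math

# A segment is a list of frames; each frame is a list of feature values
# (for example mel-cepstrum coefficients), i.e. a T x D matrix.

def time_derivative(segment):
    """First-order difference between consecutive frames (assumed derivative)."""
    return [
        [nxt - cur for cur, nxt in zip(segment[t], segment[t + 1])]
        for t in range(len(segment) - 1)
    ]

def subtract_time_mean(segment):
    """Subtract each coefficient's average over time from every frame."""
    n_frames = len(segment)
    n_coeffs = len(segment[0])
    means = [sum(frame[d] for frame in segment) / n_frames for d in range(n_coeffs)]
    return [[frame[d] - means[d] for d in range(n_coeffs)] for frame in segment]

def euclidean(a, b):
    """Euclidean distance between two equally shaped matrices (assumed base distance)."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("segments must have the same shape; no time alignment is done")
    return math.sqrt(sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb)))

def derivative_distance(a, b):
    """Measure 1: distance between the time derivatives of two segments."""
    return euclidean(time_derivative(a), time_derivative(b))

def mean_removed_distance(a, b):
    """Measure 2: distance after removing the time average from each segment."""
    return euclidean(subtract_time_mean(a), subtract_time_mean(b))

# Toy data: `shifted` is `rising` plus a constant offset per coefficient,
# `static` does not change over time.
rising = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
shifted = [[6.0, 7.0], [7.0, 9.0], [8.0, 11.0]]
static = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]

print(derivative_distance(rising, shifted), mean_removed_distance(rising, shifted))
print(derivative_distance(rising, static), mean_removed_distance(rising, static))
```

In this toy data a constant offset per coefficient leaves both distances at zero, while a segment that moves over time stays separated from a static one. This matches the stated aim of representing dynamic rather than static features.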
|Journal||Denshi Gijutsu Sogo Kenkyusho Iho/Bulletin of the Electrotechnical Laboratory|
|Publication status||Published - 1988|