DEMIPHONEME NETWORK REPRESENTATION OF SPEECH AND AUTOMATIC LABELING TECHNIQUES FOR SPEECH DATA BASE CONSTRUCTION.

Kazuyo Tanaka*, Satoru Hayamizu, Kozo Ohta

*Corresponding author for this work

Research output: Conference article, peer-reviewed

10 citations (Scopus)

Abstract

An automatic labeling technique for known speech samples is proposed to construct a fine speech database for investigating the acoustic-phonetic characteristics of speech. An acoustically compact descriptive unit called a demiphoneme (DPH) is introduced, and a word (or sentence) is represented by a network of DPHs that covers the acoustic variation contained in the utterances of the word (or sentence). An input speech sample is segmented and labeled to produce the optimal DPH sequence by the following algorithm: (a) Generate possible DPH sequences from an input phoneme sequence by rules. (b) Segment the sample parameter sequence. The boundaries of the resultant segments (called SEGs) are the candidates for DPH boundaries. (c) Determine the optimal correspondence between the SEG sequence and each of the DPH sequences generated in (a). (d) Decide the minimum-error DPH sequence and corresponding SEG boundaries. The feasibility of the method is confirmed by applying it to a word set containing 53 city names.
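
Steps (a) to (d) can be illustrated with a minimal Python sketch. The abstract does not give the DPH inventory, the expansion rules, the acoustic parameters, or the error measure, so everything concrete here is an assumption made for illustration: the rule table `DPH_RULES`, the one-dimensional templates in `DPH_TEMPLATE`, the threshold-based `segment` function, the squared-error cost, and the constraint that each DPH covers one or more consecutive SEGs.

```python
from itertools import product

# Hypothetical rules: each phoneme expands to one or more alternative DPH
# strings. The actual rules and DPH names are not given in the abstract.
DPH_RULES = {
    "k": [("k_cl", "k_b"), ("k_b",)],  # with or without a closure portion
    "a": [("a1", "a2")],
}

# Hypothetical scalar template per DPH, standing in for an acoustic model.
DPH_TEMPLATE = {"k_cl": 0.0, "k_b": 0.5, "a1": 1.0, "a2": 1.2}


def generate_dph_sequences(phonemes):
    """Step (a): expand a phoneme sequence into every DPH sequence the rules allow."""
    alternatives = [DPH_RULES[p] for p in phonemes]
    return [tuple(d for part in choice for d in part)
            for choice in product(*alternatives)]


def segment(frames, threshold=0.1):
    """Step (b): cut wherever adjacent frames differ by more than `threshold`."""
    cuts = [0]
    cuts += [i for i in range(1, len(frames))
             if abs(frames[i] - frames[i - 1]) > threshold]
    cuts.append(len(frames))
    return list(zip(cuts[:-1], cuts[1:]))  # SEGs as (start, end) frame ranges


def align(frames, segs, dphs):
    """Step (c): dynamic programming over SEGs and DPHs.

    Each DPH must absorb one or more consecutive SEGs (an assumed constraint).
    Returns (total squared error, [(dph, start_frame, end_frame), ...]).
    """
    inf = float("inf")
    n, m = len(segs), len(dphs)

    def err(s, d):
        lo, hi = segs[s]
        return sum((x - DPH_TEMPLATE[dphs[d]]) ** 2 for x in frames[lo:hi])

    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, min(i, m) + 1):
            stay, advance = cost[i - 1][j], cost[i - 1][j - 1]
            back[i][j] = j if stay < advance else j - 1
            cost[i][j] = min(stay, advance) + err(i - 1, j - 1)
    if cost[n][m] == inf:  # fewer SEGs than DPHs: no valid correspondence
        return inf, []

    # Backtrack to find which DPH owns each SEG, then merge SEGs per DPH.
    owner = [0] * n
    j = m
    for i in range(n, 0, -1):
        owner[i - 1] = j - 1
        j = back[i][j]
    labeled = []
    for d in range(m):
        mine = [segs[s] for s in range(n) if owner[s] == d]
        labeled.append((dphs[d], mine[0][0], mine[-1][1]))
    return cost[n][m], labeled


def label(frames, phonemes):
    """Step (d): keep the DPH sequence whose best alignment has minimum error."""
    segs = segment(frames)
    results = [align(frames, segs, dphs)
               for dphs in generate_dph_sequences(phonemes)]
    return min(results, key=lambda r: r[0])


# Toy utterance of /ka/ with a closure portion before the burst.
error, labels = label([0.0, 0.0, 0.5, 0.5, 1.0, 1.0, 1.2, 1.2], ["k", "a"])
```

In this toy setup the rule table plays the role of the DPH network: the two alternatives for "k" let the same word be labeled with or without a closure portion, and the alignment error decides which variant fits a given utterance.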

Original language: English
Pages (from-to): 309-312
Number of pages: 4
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publication status: Published - 1986
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
