STATISTICAL MODELING AND RECOGNITION OF RHYTHM IN SPEECH

Satoru Hayamizu, Kazuyo Tanaka

Research output: Paper, peer-reviewed

Abstract

This paper proposes a new framework for processing rhythm in speech where temporal types are recognized using statistical models of mora durations. Temporal patterns, such as rhythm and tempo in speech, contain some basic information about communication through the spoken language. This information has not yet been fully used in speech recognition. This paper proposes that temporal types themselves be modeled and recognized by statistical models. Using the ASJ Continuous Speech Database, experiments for recognizing temporal types of bunsetsu (short phrases) were conducted. Approximately 72% of temporal types were identified correctly using these models, without using information about the length of pauses and fundamental frequencies. The recognized types were very consistent (approximately 94% were of the same types) for closed and open models. These results show the promising potential of the proposed framework.

Original language: English
Pages: 199-202
Number of pages: 4
Publication status: Published - 1994
Externally published: Yes
Event: 3rd International Conference on Spoken Language Processing, ICSLP 1994 - Yokohama, Japan
Duration: 18 Sep 1994 to 22 Sep 1994

Conference

Conference: 3rd International Conference on Spoken Language Processing, ICSLP 1994
Country/Territory: Japan
City: Yokohama
Period: 94/9/18 to 94/9/22

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
