Abstract
This paper describes a method for automatically detecting filled (vocalized) pauses, which are one of the hesitation phenomena that current speech recognizers typically cannot handle. The detection of these pauses is important in spontaneous speech dialogue systems because they play valuable roles, such as helping a speaker keep a conversational turn, in oral communication. Although a few speech recognition systems have processed filled pauses within subword-based connected word recognition or word-spotting frameworks, they did not detect the pauses individually and consequently could not consider their roles. In this paper we propose a method that detects filled pauses and word lengthening on the basis of small fundamental frequency transition and small spectral envelope deformation under the assumption that speakers do not change articulator parameters during filled pauses. Experimental results for a Japanese spoken dialogue corpus show that our real-time filled-pause-detection system yielded a recall rate of 84.9% and a precision rate of 91.5%.
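The detection criterion described in the abstract (small fundamental frequency transition together with small spectral envelope deformation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, frame rate, or thresholds, so the function name `detect_sustained_segments`, the parameters `max_f0_step`, `max_env_dist`, and `min_frames`, their default values, and the use of Euclidean distance between per-frame envelope vectors are all assumptions made for illustration. The input is assumed to be a per-frame F0 track in Hz (0 for unvoiced frames) and a per-frame spectral envelope vector.

```python
import math

def detect_sustained_segments(f0, envelopes,
                              max_f0_step=2.0,    # hypothetical threshold (Hz per frame)
                              max_env_dist=0.5,   # hypothetical threshold (envelope distance)
                              min_frames=20):     # hypothetical minimum run length
    """Return (start, end) frame ranges, end exclusive, where voicing is
    sustained with little F0 movement and little envelope change."""
    segments = []
    start = None
    for i in range(1, len(f0)):
        voiced = f0[i] > 0 and f0[i - 1] > 0
        steady = (
            voiced
            and abs(f0[i] - f0[i - 1]) <= max_f0_step
            and math.dist(envelopes[i], envelopes[i - 1]) <= max_env_dist
        )
        if steady:
            if start is None:
                start = i - 1          # the run includes the previous frame
        else:
            if start is not None and i - start >= min_frames:
                segments.append((start, i))
            start = None
    # Close a run that reaches the end of the signal.
    if start is not None and len(f0) - start >= min_frames:
        segments.append((start, len(f0)))
    return segments

# Toy input: 30 voiced frames with constant F0 and a constant envelope,
# surrounded by unvoiced frames.
f0_track = [0.0] * 5 + [120.0] * 30 + [0.0] * 5
env_track = [[1.0, 2.0]] * 40
print(detect_sustained_segments(f0_track, env_track))  # [(5, 35)]
```

The minimum-duration check is one simple way to separate a sustained filled pause or lengthened vowel from short stationary stretches in ordinary speech; a real-time system would apply the same test incrementally as frames arrive.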
Original language | English |
---|---|
Pages | 227-230 |
Number of pages | 4 |
Publication status | Published - 1999 |
Externally published | Yes |
Event | 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 - Budapest, Hungary Duration: 5 Sep 1999 → 9 Sep 1999 |
Conference
Conference | 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 |
---|---|
Country/Territory | Hungary |
City | Budapest |
Period | 5 Sep 1999 → 9 Sep 1999 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Linguistics and Language
- Communication