Speech starter: Noise-robust endpoint detection by using filled pauses

Koji Kitayama, Masataka Goto, Katunobu Itou, Tetsunori Kobayashi

    Research output: Conference contribution

    9 Citations (Scopus)

    Abstract

    In this paper we propose a speech interface function, called speech starter, that enables noise-robust endpoint (utterance) detection for speech recognition. When current speech recognizers are used in a noisy environment, a typical recognition error is caused by incorrect endpoints because their automatic detection is likely to be disturbed by non-stationary noises. The speech starter function enables a user to specify the beginning of each utterance by uttering a filler with a filled pause, which is used as a trigger to start speech-recognition processes. Since filled pauses can be detected robustly in a noisy environment, practical endpoint detection is achieved. Speech starter also offers the advantage of providing a hands-free speech interface and it is user-friendly because a speaker tends to utter filled pauses (e.g., "er.") at the beginning of utterances when hesitating in human-human communication. Experimental results from a 10-dB-SNR noisy environment show that the recognition error rate with speech starter was lower than with conventional endpoint-detection methods.
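The triggering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical upstream detector that already labels each audio frame as filled pause or not, and it assumes that recognition starts at the first frame after a sufficiently long filled pause. The function name `utterance_start_frames` and the parameter `min_pause_frames` are invented for illustration.

```python
from typing import Iterable, List


def utterance_start_frames(filled_pause_flags: Iterable[bool],
                           min_pause_frames: int = 3) -> List[int]:
    """Return frame indices where recognition would be triggered.

    Assumption (not from the paper): a trigger fires at the first
    non-filled-pause frame that follows a run of at least
    `min_pause_frames` consecutive filled-pause frames.
    """
    starts: List[int] = []
    run = 0  # length of the current run of filled-pause frames
    for i, is_filled_pause in enumerate(filled_pause_flags):
        if is_filled_pause:
            run += 1
        else:
            # The run just ended; trigger only if it was long enough.
            if run >= min_pause_frames:
                starts.append(i)
            run = 0
    # A filled pause that lasts until the end of the input triggers nothing,
    # because no utterance follows it.
    return starts


# Hypothetical per-frame detector output: one short blip (ignored),
# one qualifying filled pause, and one pause that runs to the end of input.
flags = [False, True, False, True, True, True, False, False,
         True, True, True, True]
print(utterance_start_frames(flags, min_pause_frames=3))
```

The minimum run length is there so that a single misdetected frame does not start recognition; a suitable value would depend on the frame rate and the detector, neither of which is specified here.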

    Original language: English
    Title of host publication: EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
    Publisher: International Speech Communication Association
    Pages: 1237-1240
    Number of pages: 4
    Publication status: Published - 2003
    Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
    Duration: 1 Sep 2003 to 4 Sep 2003

    ASJC Scopus subject areas

    • Computer Science Applications
    • Software
    • Linguistics and Language
    • Communication

    Cite this

    Kitayama, K., Goto, M., Itou, K., & Kobayashi, T. (2003). Speech starter: Noise-robust endpoint detection by using filled pauses. In EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology (pp. 1237-1240). International Speech Communication Association.