Speech spotter: On-demand speech recognition in human-human conversation on the telephone or in face-to-face situations

Masataka Goto, Koji Kitayama, Katunobu Itou, Tetsunori Kobayashi

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    8 Citations (Scopus)

    Abstract

    This paper describes a novel speech-interface function, called "speech spotter", which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation. In the past, it has been difficult to use automatic speech recognition in human-human conversation because it was not easy to judge, from microphone input alone, whether a user was speaking to another person or to a speech recognizer. We solve this problem by using two kinds of nonverbal speech information: a filled pause (a vowel-lengthening hesitation like "er…") and voice pitch. A voice command is accepted by the speech recognizer only when the user utters it with a high pitch just after a filled pause. Using this speech-spotter function, we have built two application systems: an on-demand information system for assisting human-human conversation and a music-playback system for enriching telephone conversation. The results from using these systems have shown that the speech-spotter function is robust and convenient enough to be used in face-to-face or cellular-phone conversations.
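The acceptance rule in the abstract (a command counts only if it is spoken with a high pitch just after a filled pause) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Segment` type, the `spot_commands` function, the idea of comparing mean pitch against a per-speaker baseline, and the `pitch_ratio` and `max_gap_s` thresholds are all hypothetical. The sketch also assumes that filled-pause detection and pitch estimation have already been done upstream.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One pre-segmented stretch of audio (hypothetical structure)."""
    kind: str          # "filled_pause" or "speech"
    start: float       # start time in seconds
    end: float         # end time in seconds
    mean_f0_hz: float  # mean voice pitch over the segment
    text: str = ""     # recognizer hypothesis for speech segments


def spot_commands(
    segments: List[Segment],
    baseline_f0_hz: float,
    pitch_ratio: float = 1.2,  # assumed threshold, not from the paper
    max_gap_s: float = 0.5,    # assumed meaning of "just after"
) -> List[str]:
    """Return the text of speech segments that satisfy both cues."""
    accepted = []
    for prev, cur in zip(segments, segments[1:]):
        # Cue 1: the utterance directly follows a filled pause.
        follows_pause = (
            prev.kind == "filled_pause"
            and (cur.start - prev.end) <= max_gap_s
        )
        # Cue 2: the utterance is pitched above the speaker's baseline.
        high_pitch = cur.mean_f0_hz >= pitch_ratio * baseline_f0_hz
        if cur.kind == "speech" and follows_pause and high_pitch:
            accepted.append(cur.text)
    return accepted


if __name__ == "__main__":
    conversation = [
        Segment("speech", 0.0, 1.0, 120.0, "so anyway"),
        Segment("filled_pause", 1.2, 1.8, 125.0),
        Segment("speech", 1.9, 2.6, 170.0, "play jazz"),
        Segment("speech", 3.0, 3.5, 175.0, "really"),
        Segment("filled_pause", 4.0, 4.5, 120.0),
        Segment("speech", 4.6, 5.0, 118.0, "I think so"),
    ]
    print(spot_commands(conversation, baseline_f0_hz=120.0))
```

In this toy conversation only "play jazz" meets both conditions: "really" is high-pitched but does not follow a filled pause, and "I think so" follows a filled pause but is spoken at normal pitch. Requiring both cues together is what lets ordinary conversational speech pass through without triggering the recognizer.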

    Original language: English
    Title of host publication: 8th International Conference on Spoken Language Processing, ICSLP 2004
    Publisher: International Speech Communication Association
    Pages: 1533-1536
    Number of pages: 4
    Publication status: Published - 2004
    Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
    Duration: 2004 Oct 4 to 2004 Oct 8

    Other

    Other: 8th International Conference on Spoken Language Processing, ICSLP 2004
    Country: Korea, Republic of
    City: Jeju, Jeju Island
    Period: 04/10/4 to 04/10/8

    ASJC Scopus subject areas

    • Language and Linguistics
    • Linguistics and Language

    Cite this

    Goto, M., Kitayama, K., Itou, K., & Kobayashi, T. (2004). Speech spotter: On-demand speech recognition in human-human conversation on the telephone or in face-to-face situations. In 8th International Conference on Spoken Language Processing, ICSLP 2004 (pp. 1533-1536). International Speech Communication Association.