Speech spotter: On-demand speech recognition in human-human conversation on the telephone or in face-to-face situations

Masataka Goto, Koji Kitayama, Katunobu Itou, Tetsunori Kobayashi

Research output: Contribution to conference › Paper › peer-review

8 Citations (Scopus)

Abstract

This paper describes a novel speech-interface function, called "speech spotter", which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation. In the past, it has been difficult to use automatic speech recognition in human-human conversation since it was not easy to judge, from only microphone input, whether a user was speaking to another person or a speech recognizer. We solve this problem by using two kinds of nonverbal speech information: a filled pause (a vowel-lengthening hesitation like "er...") and voice pitch. Only when a user utters a voice command with a high pitch just after a filled pause is the voice command accepted by the speech recognizer. By using this speech-spotter function, we have built two application systems: an on-demand information system for assisting human-human conversation and a music-playback system for enriching telephone conversation. The results from using these systems have shown that the speech-spotter function is robust and convenient enough to be used in face-to-face or cellular-phone conversations.
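
The acceptance rule stated in the abstract (a command counts only if it is spoken with a high pitch just after a filled pause) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Utterance` type, the `accept_command` function, and the `max_gap_s` and `pitch_ratio` values are all hypothetical, and the sketch assumes that a filled-pause detector and a pitch estimator already exist upstream and supply the timing and pitch values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    """Hypothetical container for one detected utterance."""
    start_s: float      # utterance start time, in seconds
    mean_f0_hz: float   # mean voice pitch (F0) of the utterance, in Hz


def accept_command(
    utt: Utterance,
    filled_pause_end_s: Optional[float],
    baseline_f0_hz: float,
    max_gap_s: float = 1.0,    # assumed: how soon after the pause counts as "just after"
    pitch_ratio: float = 1.3,  # assumed: how far above the baseline counts as "high pitch"
) -> bool:
    """Return True only if both nonverbal cues are present."""
    # No filled pause was detected, so the utterance is treated as
    # ordinary conversation.
    if filled_pause_end_s is None:
        return False

    # Cue 1: the utterance begins shortly after the filled pause ends.
    gap_s = utt.start_s - filled_pause_end_s
    follows_pause = 0.0 <= gap_s <= max_gap_s

    # Cue 2: the utterance is spoken noticeably above the speaker's
    # usual pitch.
    high_pitch = utt.mean_f0_hz >= pitch_ratio * baseline_f0_hz

    return follows_pause and high_pitch
```

With these assumed thresholds, an utterance that starts 0.2 s after a filled pause at 260 Hz, from a speaker whose baseline is 180 Hz, would be passed to the recognizer, while the same utterance at 190 Hz, or one with no preceding filled pause, would be ignored as ordinary conversation.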

Original language: English
Pages: 1533-1536
Number of pages: 4
Publication status: Published - 2004 Jan 1
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 2004 Oct 4 to 2004 Oct 8

Other

Other: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 04/10/4 to 04/10/8

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
