Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems

Kyoko Matsuyama, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)

Abstract

In conversational dialogue systems, users prefer to speak at any time and to use natural expressions. We have developed a semi-blind source separation method based on Independent Component Analysis (ICA), which allows users to barge in over system utterances at any time. We created a novel method that uses timing information derived from barge-in utterances to identify the item a user indicates during system enumeration. First, we determine the timing distribution of user utterances containing referential expressions and approximate it with a gamma distribution. Second, we represent both the utterance timing and the automatic speech recognition (ASR) results as probabilities of the desired selection among the items the system enumerates. We then integrate these two probabilities to identify the item with the maximum likelihood of selection. Experiments on 400 utterances showed that our method achieved higher identification accuracy than two baseline methods, one using ASR results only and one using utterance timing only.
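The integration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gamma shape and scale values, the choice to measure delay from the moment each item starts being read out, and the simple product used to combine the two probabilities are all assumptions made for illustration.

```python
import math


def gamma_pdf(x, shape, scale):
    """Gamma density at x; zero for non-positive x."""
    if x <= 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)) / (
        math.gamma(shape) * scale ** shape
    )


def timing_probs(utterance_time, item_start_times, shape=2.0, scale=1.0):
    """Probability of each item given only the barge-in timing.

    Assumption: the delay between an item's start and the user's
    utterance follows a gamma distribution (shape/scale are made up here).
    """
    densities = [
        gamma_pdf(utterance_time - start, shape, scale)
        for start in item_start_times
    ]
    total = sum(densities)
    if total == 0:
        # No item has started yet; fall back to a uniform distribution.
        return [1.0 / len(densities)] * len(densities)
    return [d / total for d in densities]


def identify_item(utterance_time, item_start_times, asr_probs,
                  shape=2.0, scale=1.0):
    """Index of the item with the highest combined score.

    Assumption: the timing and ASR probabilities are combined by a
    plain product; the paper may weight them differently.
    """
    timing = timing_probs(utterance_time, item_start_times, shape, scale)
    scores = [t * a for t, a in zip(timing, asr_probs)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical example: three items read out at 0 s, 3 s, and 6 s.
# The user barges in at 4 s; ASR alone slightly prefers item 0.
starts = [0.0, 3.0, 6.0]
asr = [0.5, 0.4, 0.1]
chosen = identify_item(4.0, starts, asr)
print(chosen)  # prints 1: timing evidence outweighs the ASR preference
```

In this made-up case the ASR scores alone would select item 0, while the timing evidence favors the item that began one second before the user spoke, so the combined score selects item 1.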

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Pages: 252-255
Number of pages: 4
Publication status: Published - 2009
Externally published: Yes
Event: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009 - Brighton, United Kingdom
Duration: 2009 Sep 6 to 2009 Sep 10

Other

Other: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
Country: United Kingdom
City: Brighton
Period: 2009/9/6 to 2009/9/10

Fingerprint

Barges
Speech recognition
Blind source separation
Independent component analysis
Maximum likelihood

Keywords

  • Barge-in
  • Conversational interaction
  • Spoken dialogue system
  • Utterance timing

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems

Cite this

Matsuyama, K., Komatani, K., Ogata, T., & Okuno, H. G. (2009). Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 252-255)

Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems. / Matsuyama, Kyoko; Komatani, Kazunori; Ogata, Tetsuya; Okuno, Hiroshi G.

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2009. p. 252-255.

Matsuyama, K, Komatani, K, Ogata, T & Okuno, HG 2009, Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems. in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. pp. 252-255, 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009, Brighton, United Kingdom, 09/9/6.
Matsuyama K, Komatani K, Ogata T, Okuno HG. Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2009. p. 252-255
Matsuyama, Kyoko ; Komatani, Kazunori ; Ogata, Tetsuya ; Okuno, Hiroshi G. / Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2009. pp. 252-255
@inproceedings{717395baf6da481cb5269d37248f72a7,
title = "Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems",
abstract = "In conversational dialogue systems, users prefer to speak at any time and to use natural expressions. We have developed an Independent Component Analysis (ICA) based semi-blind source separation method, which allows users to barge-in over system utterances at any time. We created a novel method from timing information derived from barge-in utterances to identify one item that a user indicates during system enumeration. First, we determine the timing distribution of user utterances containing referential expressions and then approximate it using a gamma distribution. Second, we represent both the utterance timing and automatic speech recognition (ASR) results as probabilities of the desired selection from the system's enumeration. We then integrate these two probabilities to identify the item having the maximum likelihood of selection. Experimental results using 400 utterances indicated that our method outperformed two methods used as a baseline (one of ASR results only and one of utterance timing only) in identification accuracy.",
keywords = "Barge-in, Conversational interaction, Spoken dialogue system, Utterance timing",
author = "Kyoko Matsuyama and Kazunori Komatani and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2009",
language = "English",
pages = "252--255",
booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH"
}

TY - GEN

T1 - Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems

AU - Matsuyama, Kyoko

AU - Komatani, Kazunori

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2009

Y1 - 2009

AB - In conversational dialogue systems, users prefer to speak at any time and to use natural expressions. We have developed an Independent Component Analysis (ICA) based semi-blind source separation method, which allows users to barge-in over system utterances at any time. We created a novel method from timing information derived from barge-in utterances to identify one item that a user indicates during system enumeration. First, we determine the timing distribution of user utterances containing referential expressions and then approximate it using a gamma distribution. Second, we represent both the utterance timing and automatic speech recognition (ASR) results as probabilities of the desired selection from the system's enumeration. We then integrate these two probabilities to identify the item having the maximum likelihood of selection. Experimental results using 400 utterances indicated that our method outperformed two methods used as a baseline (one of ASR results only and one of utterance timing only) in identification accuracy.

KW - Barge-in

KW - Conversational interaction

KW - Spoken dialogue system

KW - Utterance timing

UR - http://www.scopus.com/inward/record.url?scp=70450162205&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70450162205&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:70450162205

SP - 252

EP - 255

BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

ER -