Belief network based disambiguation of object reference in spoken dialogue system

Yoko Yamakata, Tatsuya Kawahara, Hiroshi G. Okuno, Michihiko Minoh

Research output: Contribution to journal (Article)

4 Citations (Scopus)

Abstract

This paper discusses a problem in human-machine interaction that arises when the object referred to by a spoken word is ambiguous. We study a joint activity of several agents in which a remote robot finds an object while communicating with the user over a voice-only channel. We focus on how the robot disambiguates the reference of an uttered word or phrase to the target object. For example, the word "cup" may refer to a "teacup", a "coffee cup", or even a "glass", depending on the user and the situation. This reference (hereafter, "object reference") is therefore user- and situation-dependent. We conducted two experiments. The first, with 12 subjects, confirmed that the user model of object references is significant. The second, with 20 subjects, showed that the object reference model is sensitive to the situation. In addition to the ambiguity of the object reference, an actual system must cope with two sources of uncertainty: speech recognition and image recognition. We present a belief-network-based probabilistic reasoning system that determines the object reference. With the resulting system, the number of interactions needed to find a common reference decreases as the user model is refined.
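The probabilistic reasoning described in the abstract can be illustrated with a minimal sketch. It is not the authors' belief network: the function name, the three candidate objects, and every probability below are hypothetical values chosen for illustration. The sketch assumes a per-user table P(word | object) as the "user model" and a single noisy speech-recognition observation, and it omits image-recognition uncertainty for brevity.

```python
from typing import Dict


def posterior_over_objects(
    prior: Dict[str, float],
    word_given_object: Dict[str, Dict[str, float]],
    recognized_given_word: Dict[str, float],
) -> Dict[str, float]:
    """Return P(object | recognizer output) via Bayes' rule.

    prior                 -- P(object), e.g. from the current situation
    word_given_object     -- hypothetical user model, P(word | object)
    recognized_given_word -- P(observed recognizer output | uttered word),
                             one entry per word the user may have said
    """
    scores = {}
    for obj, p_obj in prior.items():
        # Marginalise over the word the user actually uttered.
        likelihood = sum(
            word_given_object[obj].get(word, 0.0) * p_rec
            for word, p_rec in recognized_given_word.items()
        )
        scores[obj] = p_obj * likelihood

    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {obj: score / total for obj, score in scores.items()}


# Hypothetical numbers: three candidate objects, uniform prior.
prior = {"teacup": 1 / 3, "coffee cup": 1 / 3, "glass": 1 / 3}

# Hypothetical user model: how this user tends to name each object.
user_model = {
    "teacup": {"cup": 0.7, "teacup": 0.3},
    "coffee cup": {"cup": 0.9, "mug": 0.1},
    "glass": {"cup": 0.2, "glass": 0.8},
}

# The recognizer output "cup"; it may be a misrecognition of "glass".
recognition = {"cup": 0.9, "glass": 0.1}

posterior = posterior_over_objects(prior, user_model, recognition)
best_guess = max(posterior, key=posterior.get)
```

Under these assumed numbers the most probable referent is the coffee cup, so a robot following this sketch would ask about that object first. Refining the user model sharpens the posterior, which is one way to read the abstract's claim that fewer interactions are needed as the user model improves.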

Original language: English
Pages (from-to): 47-56
Number of pages: 10
Journal: Transactions of the Japanese Society for Artificial Intelligence
Volume: 19
Issue number: 1
DOIs: 10.1527/tjsai.19.47
Publication status: Published - 2004
Externally published: Yes


Keywords

  • Belief network
  • Object reference
  • Spoken dialogue
  • User model

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Belief network based disambiguation of object reference in spoken dialogue system. / Yamakata, Yoko; Kawahara, Tatsuya; Okuno, Hiroshi G.; Minoh, Michihiko.

In: Transactions of the Japanese Society for Artificial Intelligence, Vol. 19, No. 1, 2004, p. 47-56.


@article{459571e8a3b948b18d3c53e4344c9f1e,
title = "Belief network based disambiguation of object reference in spoken dialogue system",
abstract = "This paper discusses a problem of human-machine interaction when spoken word to object reference ambiguity occurs. We study joint activity of several agents in which a remote robot finds an object while communicating with the user over a voice-only channel. We focus on the problem in which the robot disambiguates the reference of the uttered word or phrase to the target object. For example, the utterance of the word {"}cup{"} may refer to a {"}teacup{"}, a {"}coffee cup{"}, or even a {"}glass{"} for different users in some situations. This reference (hereafter, {"}object reference{"}) is user and situation dependent. We conducted two experiments. The first experiment including 12 subjects confirmed that the user model of object references is significant. In the second experiment conducted on 20 subjects, we show the model reference sensitivity to the situation. In addition to the ambiguity of the object reference, the actual system must cope with two sources of uncertainty: speech and image recognition. We present the belief network based probabilistic reasoning system to determine the object reference. The resulting system demonstrates that the number of interactions needed to find a common reference is reduced as the user model is refined.",
keywords = "Belief network, Object reference, Spoken dialogue, User model",
author = "Yoko Yamakata and Tatsuya Kawahara and Okuno, {Hiroshi G.} and Michihiko Minoh",
year = "2004",
doi = "10.1527/tjsai.19.47",
language = "English",
volume = "19",
pages = "47--56",
journal = "Transactions of the Japanese Society for Artificial Intelligence",
issn = "1346-0714",
publisher = "Japanese Society for Artificial Intelligence",
number = "1",
}

TY - JOUR

T1 - Belief network based disambiguation of object reference in spoken dialogue system

AU - Yamakata, Yoko

AU - Kawahara, Tatsuya

AU - Okuno, Hiroshi G.

AU - Minoh, Michihiko

PY - 2004

Y1 - 2004

N2 - This paper discusses a problem of human-machine interaction when spoken word to object reference ambiguity occurs. We study joint activity of several agents in which a remote robot finds an object while communicating with the user over a voice-only channel. We focus on the problem in which the robot disambiguates the reference of the uttered word or phrase to the target object. For example, the utterance of the word "cup" may refer to a "teacup", a "coffee cup", or even a "glass" for different users in some situations. This reference (hereafter, "object reference") is user and situation dependent. We conducted two experiments. The first experiment including 12 subjects confirmed that the user model of object references is significant. In the second experiment conducted on 20 subjects, we show the model reference sensitivity to the situation. In addition to the ambiguity of the object reference, the actual system must cope with two sources of uncertainty: speech and image recognition. We present the belief network based probabilistic reasoning system to determine the object reference. The resulting system demonstrates that the number of interactions needed to find a common reference is reduced as the user model is refined.

KW - Belief network

KW - Object reference

KW - Spoken dialogue

KW - User model

UR - http://www.scopus.com/inward/record.url?scp=18444391199&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=18444391199&partnerID=8YFLogxK

U2 - 10.1527/tjsai.19.47

DO - 10.1527/tjsai.19.47

M3 - Article

VL - 19

SP - 47

EP - 56

JO - Transactions of the Japanese Society for Artificial Intelligence

JF - Transactions of the Japanese Society for Artificial Intelligence

SN - 1346-0714

IS - 1

ER -