TY - GEN
T1 - Improving identification accuracy by extending acceptable utterances in spoken dialogue system using barge-in timing
AU - Matsuyama, Kyoko
AU - Komatani, Kazunori
AU - Takahashi, Toru
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
PY - 2010
Y1 - 2010
N2 - We describe a novel dialogue strategy that enables robust interaction in noisy environments, where automatic speech recognition (ASR) results are not necessarily reliable. We have developed a method that exploits utterance timing together with ASR results to interpret user intention, that is, to identify the one item that a user wants to indicate from the system's enumeration. The timing of utterances containing referential expressions is approximated by a gamma distribution, which is integrated with ASR results by expressing both as probabilities. In this paper, we improve the identification accuracy by extending the method. First, we enable interpretation of utterances that include ordinal numbers, which appear several times in the data we collected from users. Second, we use appropriate acoustic models and parameters, improving the identification accuracy by 4.0% in total. We also show that Latent Semantic Mapping (LSM) enables more expressions to be handled in our framework.
KW - barge-in
KW - conversational interaction
KW - spoken dialogue systems
KW - utterance timing
UR - http://www.scopus.com/inward/record.url?scp=79551518947&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79551518947&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-13025-0_60
DO - 10.1007/978-3-642-13025-0_60
M3 - Conference contribution
AN - SCOPUS:79551518947
SN - 3642130240
SN - 9783642130243
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 585
EP - 594
BT - Trends in Applied Intelligent Systems - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010, Proceedings
T2 - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010
Y2 - 1 June 2010 through 4 June 2010
ER -