TY - JOUR
T1 - Managing out-of-grammar utterances by topic estimation with domain extensibility in multi-domain spoken dialogue systems
AU - Komatani, Kazunori
AU - Ikeda, Satoshi
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
N1 - Funding Information:
We are grateful to Mr. Teruhisa Misu and Prof. Tatsuya Kawahara of Kyoto University for allowing us to use the document-collecting program they developed (Misu and Kawahara, 2006). This work was supported in part by a grant-in-aid for scientific research from the Ministry of Education, Culture, Sports, Science and Technology of Japan and from the Support Center for Advanced Telecommunications Technology Research (SCAT).
PY - 2008/10
Y1 - 2008/10
AB - Spoken dialogue systems must inevitably deal with out-of-grammar utterances. We address this problem in multi-domain spoken dialogue systems, which deal with more tasks than a single-domain system. We defined a topic by augmenting a domain about which users want to find more information, and we developed a method of recovering from out-of-grammar utterances based on topic estimation, i.e., by providing a help message in the estimated domain. Moreover, domain extensibility, that is, the ability to add new domains to the system, should be inherently retained in multi-domain systems. To estimate domains without sacrificing extensibility, we collected documents from the Web as training data. Since the data contained a certain amount of noise, we used latent semantic mapping (LSM), which enables robust topic estimation by removing the effects of noise from the data. Experimental results showed that our method improved topic estimation accuracy by 23.2 points for data including out-of-grammar utterances.
KW - Domain extensibility
KW - Multi-domain spoken dialogue system
KW - Out-of-grammar utterance
KW - Topic estimation
UR - http://www.scopus.com/inward/record.url?scp=52949125236&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=52949125236&partnerID=8YFLogxK
U2 - 10.1016/j.specom.2008.05.010
DO - 10.1016/j.specom.2008.05.010
M3 - Article
AN - SCOPUS:52949125236
VL - 50
SP - 863
EP - 870
JO - Speech Communication
JF - Speech Communication
SN - 0167-6393
IS - 10
ER -