TY - GEN
T1 - Dialog state tracking with attention-based sequence-to-sequence learning
AU - Hori, Takaaki
AU - Wang, Hai
AU - Hori, Chiori
AU - Watanabe, Shinji
AU - Harsham, Bret
AU - Le Roux, Jonathan
AU - Hershey, John R.
AU - Koji, Yusuke
AU - Jing, Yi
AU - Zhu, Zhaocheng
AU - Aikawa, Takeyuki
N1 - Funding Information:
Financial support from the National Science Foundation of China (Grant No. 50021101), the Ministry of Science and Technology of China (Grant No. 1999064505) and Max-Planck Society of Germany is acknowledged. The authors thank Prof M.L. Sui, Drs W. Wang, L. Zhang, Q.H. Lu, X.J. Gu and J. Zhong for their advice and valuable discussions.
Publisher Copyright:
© 2016 IEEE.
PY - 2017/2/7
Y1 - 2017/2/7
N2 - We present an advanced dialog state tracking system designed for the 5th Dialog State Tracking Challenge (DSTC5). The main task of DSTC5 is to track the dialog state in a human-human dialog. For each utterance, the tracker emits a frame of slot-value pairs, taking into account the full history of the dialog up to the current turn. Our system includes an encoder-decoder architecture with an attention mechanism that maps an input word sequence to a set of semantic labels, i.e., slot-value pairs. This handles the problem of the unknown alignment between the utterances and the labels. By combining the attention-based tracker with rule-based trackers elaborated for English and Chinese, the F-score on the development set improved from 0.475 with the rule-only trackers to 0.507. Moreover, we achieved an F-score of 0.517 by refining the combination strategy based on the topic- and slot-level performance of each tracker. In this paper, we also validate the efficacy of each technique and report the test set results submitted to the challenge.
KW - Attention model
KW - Dialog state tracking
KW - Encoder-decoder
KW - Long short-term memory
KW - Sequence-to-sequence learning
UR - http://www.scopus.com/inward/record.url?scp=85015997608&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015997608&partnerID=8YFLogxK
U2 - 10.1109/SLT.2016.7846317
DO - 10.1109/SLT.2016.7846317
M3 - Conference contribution
AN - SCOPUS:85015997608
T3 - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
SP - 552
EP - 558
BT - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016
Y2 - 13 December 2016 through 16 December 2016
ER -