Dialog State Tracking with Attention-based Sequence-to-sequence Learning

We present a dialog state tracking system developed for the 5th Dialog State Tracking Challenge (DSTC5). The main task of DSTC5 is to track the dialog state in human-human dialogs. For each utterance, the tracker emits a frame of slot-value pairs that takes into account the full dialog history up to the current turn. Our system uses an encoder-decoder architecture with an attention mechanism to map an input word sequence to a set of semantic labels, i.e., slot-value pairs. This addresses the problem that the alignment between the utterances and the labels is unknown. Combining the attention-based tracker with rule-based trackers developed for English and Chinese improved the F-score on the development set from 0.475 (rule-based trackers only) to 0.507. Moreover, we achieved an F-score of 0.517 by refining the combination strategy based on the topic-level and slot-level performance of each tracker. In this paper, we also validate the efficacy of each technique and report the test set results submitted to the challenge.
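
To make the role of the attention mechanism concrete, the following is a minimal sketch of a single attention step: at each decoding step the model weights the encoder states of the input words and sums them into a context vector, so no word-to-label alignment has to be supplied in advance. The dot-product scoring function, the name `attention_context`, and the toy vectors are illustrative assumptions and are not taken from the system described here, which may use a different scoring function.

```python
import math


def attention_context(decoder_state, encoder_states):
    """One attention step (assumed dot-product scoring, for illustration only).

    Returns the attention weights over the input positions and the
    resulting context vector.
    """
    # Score each encoder state against the current decoder state.
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    # Softmax over the scores; subtracting the maximum keeps exp() stable.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical encoder states for a three-word utterance (made-up values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attention_context(decoder_state, encoder_states)
```

In this toy input the first and third words score equally against the decoder state and therefore receive equal weight, while the second word receives less. A full tracker would feed the context vector into the decoder to predict the next slot-value label.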