publication . Preprint . 2017

Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning

Peng, Baolin; Li, Xiujun; Gao, Jianfeng; Liu, Jingjing; Chen, Yun-Nung; Wong, Kam-Fai
Open Access, English
Published: 30 Oct 2017
This paper presents a new method, adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in task-completion dialogue systems. Inspired by generative adversarial networks (GAN), we train a discriminator to differentiate responses/actions generated by dialogue agents from responses/actions by experts. Then, we incorporate the discriminator as another critic into the advantage actor-critic (A2C) framework, to encourage the dialogue agent to explore the state-action space within the regions where the agent takes actions similar to those of the experts. Experimental results in a movie-ticket booking doma...
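
The idea of using the discriminator as a second critic can be illustrated with a minimal sketch. The abstract does not say how the two signals are combined, so everything here is an assumption made for illustration: the additive mixing rule, the `weight` parameter, the one-step TD advantage estimate, and all function names are hypothetical and are not taken from the paper.

```python
import math


def td_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """One-step TD advantage estimate: r + gamma * V(s') - V(s)."""
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


def discriminator_prob(logit):
    """Sigmoid of a discriminator logit: estimated probability that a
    (state, action) pair looks like an expert's."""
    return 1.0 / (1.0 + math.exp(-logit))


def combined_signal(advantage, disc_prob, weight=0.5):
    """ASSUMED mixing rule (not from the paper): add the weighted
    discriminator probability to the critic's advantage."""
    return advantage + weight * disc_prob


def policy_loss(log_prob_action, signal):
    """Policy-gradient loss for a single step: -log pi(a|s) * signal."""
    return -log_prob_action * signal


# Illustrative single step with made-up numbers.
adv = td_advantage(reward=1.0, value_s=0.5, value_next=0.0, done=True)
expert_like = discriminator_prob(logit=2.0)
loss = policy_loss(log_prob_action=math.log(0.25),
                   signal=combined_signal(adv, expert_like))
```

Under this assumed rule, an action the discriminator rates as expert-like receives a larger learning signal than the critic's advantage alone would give it, which matches the abstract's stated goal of steering exploration toward regions where the agent behaves like the experts.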
free text keywords: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Learning
