publication . Conference object . Preprint . 2017

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

Das, Abhishek; Kottur, Satwik; Moura, José M. F.; Lee, Stefan; Batra, Dhruv
Open Access
  • Published: 19 Mar 2017
  • Publisher: IEEE
Abstract
Comment: 11 pages, 4 figures, 2 tables, webpage: http://visualdialog.org/
Subjects
free text keywords: Computer science, Artificial intelligence, Computer vision, Visualization, Dialog box, Dialog system, Question answering, Ask price, Supervised learning, Reinforcement learning, Natural language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Learning