Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening

Preprint English OPEN
He, Frank S.; Liu, Yang; Schwing, Alexander G.; Peng, Jian (2016)
  • Subject: Statistics - Machine Learning | Computer Science - Learning

We propose a novel training algorithm for reinforcement learning which combines the strength of deep Q-learning with a constrained optimization approach to tighten optimality and encourage faster reward propagation. Our novel technique makes deep reinforcement learning ...