publication . Preprint . Part of book or chapter of book . Other literature type . 2018

Deep Reinforcement Learning: An Overview

Mousavi, Seyed Sajad; Schukat, Michael; Howley, Enda
Open Access English
  • Published: 22 Jun 2018
Abstract
Comment: Please see Deep Reinforcement Learning, arXiv:1810.06339, for a significant update
Subjects
free text keywords: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning, Artificial intelligence, Raw data, Artificial neural network, Reinforcement learning, Machine learning, Recurrent neural network, Mathematics, Convolutional neural network, Deep learning
42 references, page 1 of 3

[1] Bakker, B., Zhumatiy, V., Gruener, G., Schmidhuber, J., 2003. A robot that reinforcement-learns to identify and memorize important previous observations. In: Intelligent Robots and Systems, 2003 (IROS 2003). Proceedings. 2003 IEEE/RSJ International Conference on. Vol. 1. IEEE, pp. 430-435.

[2] Bellemare, M. G., Naddaf, Y., Veness, J., Bowling, M., 2013. The arcade learning environment: An evaluation platform for general agents. J. Artif. Intell. Res. (JAIR) 47, 253-279.

[3] Bengio, Y., Courville, A., Vincent, P., 2013. Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence 35 (8), 1798-1828.

[4] Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H., 2007. Greedy layer-wise training of deep networks. In: Advances in neural information processing systems. pp. 153-160.

[5] Bengio, Y., Simard, P., Frasconi, P., 1994. Learning long-term dependencies with gradient descent is difficult. IEEE transactions on neural networks 5 (2), 157-166.

[6] Bengio, Y., et al., 2009. Learning deep architectures for AI. Foundations and Trends® in Machine Learning 2 (1), 1-127.

[7] Beyer, H.-G., Schwefel, H.-P., 2002. Evolution strategies - a comprehensive introduction. Natural computing 1 (1), 3-52.

[8] Böhmer, W., Springenberg, J. T., Boedecker, J., Riedmiller, M., Obermayer, K., 2015. Autonomous learning of state representations for control: An emerging field aims to autonomously learn state representations for reinforcement learning agents from their real-world sensor observations. KI - Künstliche Intelligenz 29 (4), 353-362.

[9] Clark, C., Storkey, A., 2015. Training deep convolutional neural networks to play Go. In: International Conference on Machine Learning. pp. 1766-1774.

[10] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L., 2009. ImageNet: A large-scale hierarchical image database. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, pp. 248-255.

[11] Deng, L., Hinton, G., Kingsbury, B., 2013. New types of deep neural network learning for speech recognition and related applications: An overview. In: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, pp. 8599-8603.

[12] Grüttner, M., Sehnke, F., Schaul, T., Schmidhuber, J., 2010. Multi-dimensional deep memory Atari-Go players for parameter exploring policy gradients. In: International Conference on Artificial Neural Networks. Springer, pp. 114-123.

[13] Guo, X., Singh, S., Lee, H., Lewis, R. L., Wang, X., 2014. Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning. In: Advances in neural information processing systems. pp. 3338-3346.

[14] Hausknecht, M., Stone, P., 2015. Deep recurrent Q-learning for partially observable MDPs. CoRR, abs/1507.06527.

[15] Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J., et al., 2001. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies.
