Publication · Preprint · Conference object · 2017

A Benchmark Environment Motivated by Industrial Control Problems

Daniel Hein, Stefan Depeweg, Michel Tokic, Steffen Udluft, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing
Open Access · English
Published: 27 Sep 2017
Abstract
In the research area of reinforcement learning (RL), novel and promising methods are frequently developed and introduced to the RL community. However, although many researchers are keen to apply their methods to real-world problems, implementing such methods in real industry environments is often a frustrating and tedious process. Generally, academic research groups have only limited access to real industrial data and applications. For this reason, new methods are usually developed, evaluated, and compared using artificial software benchmarks. On the one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into th...
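The abstract refers to artificial software benchmarks used to evaluate and compare RL methods. As a rough illustration of the reset/step interaction loop such benchmark environments typically expose, below is a minimal Python sketch; it is not taken from the paper, and ToyBenchmarkEnv, its toy dynamics, and the random-policy baseline are hypothetical stand-ins for the kind of environment the authors describe.

# Minimal sketch (not from the paper): a generic benchmark environment with a
# reset/step API and a random-policy evaluation loop. All names and dynamics
# here are hypothetical, not the authors' industrial benchmark implementation.
import random


class ToyBenchmarkEnv:
    """Hypothetical stand-in for a benchmark environment."""

    def __init__(self, horizon=100):
        self.horizon = horizon
        self.t = 0
        self.state = 0.0

    def reset(self):
        self.t = 0
        self.state = 0.0
        return self.state

    def step(self, action):
        # Toy dynamics: the state drifts with the chosen action plus noise.
        self.t += 1
        self.state += action + random.gauss(0.0, 0.1)
        reward = -abs(self.state)          # penalize deviation from the setpoint
        done = self.t >= self.horizon
        return self.state, reward, done


def evaluate_random_policy(env, episodes=10):
    """Average episodic return of a random policy, a common baseline check."""
    returns = []
    for _ in range(episodes):
        env.reset()
        total, done = 0.0, False
        while not done:
            action = random.uniform(-1.0, 1.0)
            _, reward, done = env.step(action)
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)


if __name__ == "__main__":
    print("mean return:", evaluate_random_policy(ToyBenchmarkEnv()))

A learning agent would replace the random action selection; the averaged return then serves as a score that can be compared across methods on the same benchmark.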
Subjects
Free text keywords: Computer Science - Artificial Intelligence, Computer Science - Learning, Computer Science - Systems and Control, Machine learning, Reinforcement learning, Computer science, Java, Python (programming language), Wind power, Benchmark (computing), Artificial intelligence, Software