publication . Article . Preprint . 2014

Safe Exploration of State and Action Spaces in Reinforcement Learning

Javier García; Fernando Fernández;
Open Access
  • Published: 03 Feb 2014 Journal: Journal of Artificial Intelligence Research, volume 45, pages 515-564 (eissn: 1076-9757)
  • Publisher: AI Access Foundation
Abstract
In this paper, we consider the important problem of safe exploration in reinforcement learning. While reinforcement learning is well-suited to domains with complex transition dynamics and high-dimensional state-action spaces, an additional challenge is posed by the need for safe and efficient exploration. Traditional exploration techniques are not particularly useful for solving dangerous tasks, where the trial and error process may lead to the selection of actions whose execution in some states may result in damage to the learning system (or any other system). Consequently, when an agent begins an interaction with a dangerous and high-dimensional state-...
Subjects
free text keywords: Artificial Intelligence, Computer Science - Learning, Computer Science - Artificial Intelligence, Trial and error, Computer science, business.industry, business, Car parking, Reinforcement learning, Business management, Action control, Error-driven learning