Name: Finding safe policies in model-based active learning

Type: Publication (Conference object, External research report, Article, Other ORP type, Other literature type)
Date: 01 Jan 2014
Publisher: Federación del Gremio de Editores de España
Funded by: EC | INTELLACT

Authors: Diezma Diaz, Carlos; Jiménez, A.; Re, M.; Benavides, Julio; Rojo, S.; Román, A.; Gutiérrez, D.; +8 Authors

doi: 10.13039/501100003176

handle: 10261/127587 , 10261/115374 , 2117/27799 , 10261/221579 , 10261/239773 , 10261/181496 , 10261/162877

Abstract

Task learning in robotics is a time-consuming process, and model-based reinforcement learning algorithms have been proposed to learn from only a small number of experiences. However, reducing the number of experiences used to learn means that the algorithm may overlook crucial actions required to obtain optimal behavior. For example, a robot may learn simple policies that have a high risk of not reaching the goal because they often fall into dead-ends. We propose a new method that allows the robot to reason about dead-ends and their causes. By analyzing its current model and experiences, the robot hypothesizes the possible causes of a dead-end and identifies the actions that may lead to it, marking them as dangerous. Afterwards, whenever a dangerous action is included in a plan that has a high risk of leading to a dead-end, the robot triggers the special action "request teacher confirmation" to actively confirm with a teacher that the planned risky action should be executed. This method permits learning safer policies at the cost of just a few additional teacher demonstration requests. The approach is validated experimentally in two different scenarios: a robotic assembly task and a domain from the international planning competition. Our approach achieves success ratios very close to 1 in problems where previous approaches had high probabilities of reaching dead-ends.
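The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the dead-end risk of a plan is estimated or how dangerous actions are inferred, so here both are simply passed in as inputs. The function name `choose_next_step`, the constant `REQUEST_TEACHER_CONFIRMATION`, and the default `risk_threshold` of 0.2 are all hypothetical.

```python
# Hypothetical name for the special action mentioned in the abstract.
REQUEST_TEACHER_CONFIRMATION = "request_teacher_confirmation"


def choose_next_step(plan, dangerous_actions, dead_end_risk, risk_threshold=0.2):
    """Return the next step to execute for a given plan.

    plan              -- ordered list of action names produced by a planner
    dangerous_actions -- set of actions previously marked as dangerous
    dead_end_risk     -- estimated probability that the plan hits a dead-end
                         (assumed to be computed elsewhere)
    risk_threshold    -- assumed cut-off above which the risk counts as high
    """
    if not plan:
        return None  # nothing to execute

    # Does the plan contain at least one action marked as dangerous?
    has_dangerous_action = any(action in dangerous_actions for action in plan)

    # Ask the teacher only when both conditions from the abstract hold.
    if has_dangerous_action and dead_end_risk >= risk_threshold:
        return REQUEST_TEACHER_CONFIRMATION

    # Otherwise execute the plan as usual, starting with its first action.
    return plan[0]


if __name__ == "__main__":
    plan = ["pick_peg", "place_peg"]
    print(choose_next_step(plan, {"place_peg"}, dead_end_risk=0.5))
    print(choose_next_step(plan, {"place_peg"}, dead_end_risk=0.05))
```

Requiring both conditions keeps teacher requests rare: a dangerous action in a low-risk plan, or a risky plan with no action marked as dangerous, proceeds without interruption in this sketch.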

This work was supported by EU Project IntellAct FP7-ICT2009-6-269959, by CSIC project MANIPlus 201350E102 and by the Spanish Ministry of Science and Innovation under project PAU+ DPI2011-27510. D. Martínez is also supported by the Spanish Ministry of Education, Culture and Sport via an FPU doctoral grant (FPU12-04173).

Work presented at the IROS "Machine Learning in Planning and Control of Robot Motion Workshop" (IROS MLPC), held in Chicago, Illinois (US), from 14 to 18 September.

This item (except texts and images not created by the author) is subject to a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain.

Peer Reviewed

Related Organizations

Bowdoin College
United States
Spanish National Research Council
Spain
Universitat Politècnica de Catalunya
Spain

Keywords

Contemporary Art, Spanish female artists, Àrees temàtiques de la UPC::Informàtica::Robòtica, Acute toxicology, Pavilion 1937, Classificació INSPEC::Automation::Robots::Intelligent robots, uncertainty handling, Galactosylation, Gender, intelligent robots, Spanish Literature, Maillard reaction, and Sexuality Studies, Art and feminism, Women's Studies, Sodium caseinate, Spanish Republic, learning (artificial intelligence), Other Feminist, planning (artificial intelligence)

8 Research products

Identidades artísticas en tránsito. Introducción (2019)
Imaginarios en conflicto: “lo español” en los siglos XIX y XX (2017)
Introducción: Visiones imperiales y profecía. Roma, España, Nuevo Mundo (2013)
Arte en el Real Jardín Botánico: Patrimonio, memoria y creación (2016)
Alma Tapia: la línea moderna (2017)
Identidades y tránsitos artísticos en el exilio español de 1939 hacia Latinoamérica (2019)
Introducción. Imaginarios en conflicto: Lo español en los siglos XIX y XX (2017)
Los artistas del exilio español de 1939 en México. Caracterización y panorama (2015)

Impact by BIP!

citations: 0
popularity: Average
influence: Average
impulse: Average