Downloads provided by UsageCounts
handle: 10261/127587 , 10261/115374 , 2117/27799 , 10261/221579 , 10261/239773 , 10261/181496 , 10261/162877
Task learning in robotics is time-consuming, and model-based reinforcement learning algorithms have been proposed to learn from only a small number of experiences. However, reducing the number of experiences used to learn means that the algorithm may overlook crucial actions required for optimal behavior. For example, a robot may learn simple policies that have a high risk of not reaching the goal because they often fall into dead-ends. We propose a new method that allows the robot to reason about dead-ends and their causes. By analyzing its current model and experiences, the robot hypothesizes the possible causes of a dead-end and identifies the actions that may cause it, marking them as dangerous. Afterwards, whenever a dangerous action is included in a plan that has a high risk of leading to a dead-end, the robot triggers the special action "request teacher confirmation" to actively confirm with a teacher that the planned risky action should be executed. This method permits learning safer policies at the cost of just a few teacher demonstration requests. Experimental validation of the approach is provided in two different scenarios: a robotic assembly task and a domain from the International Planning Competition. Our approach achieves success ratios very close to 1 in problems where previous approaches had high probabilities of reaching dead-ends.
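The triggering rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `next_step`, the numeric `dead_end_risk` estimate, and the `risk_threshold` value are all assumptions introduced here for illustration. The abstract only states that the confirmation request fires when a plan both contains an action marked as dangerous and has a high risk of leading to a dead-end.

```python
from typing import List, Set

# Name of the special action, taken from the abstract.
REQUEST_TEACHER_CONFIRMATION = "request_teacher_confirmation"


def next_step(
    plan: List[str],
    dangerous: Set[str],
    dead_end_risk: float,
    risk_threshold: float = 0.1,  # assumed value, not from the source
) -> str:
    """Return the next action to execute, or the confirmation request.

    plan          -- ordered list of planned action names (hypothetical format)
    dangerous     -- actions previously marked as possible dead-end causes
    dead_end_risk -- estimated probability that the plan hits a dead-end
                     (how it is estimated is outside this sketch)
    """
    if not plan:
        raise ValueError("plan is empty")

    plan_is_risky = dead_end_risk > risk_threshold
    plan_has_dangerous_action = any(action in dangerous for action in plan)

    # Ask the teacher only when both conditions hold, so that
    # confirmation requests stay rare.
    if plan_is_risky and plan_has_dangerous_action:
        return REQUEST_TEACHER_CONFIRMATION
    return plan[0]
```

For instance, under these assumptions `next_step(["pick", "place"], {"place"}, 0.4)` returns the confirmation request, while the same plan with an estimated risk of 0.05 simply returns `"pick"`.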
This work was supported by EU Project IntellAct FP7-ICT2009-6-269959, by CSIC project MANIPlus 201350E102, and by the Spanish Ministry of Science and Innovation under project PAU+ DPI2011-27510. D. Martínez is also supported by the Spanish Ministry of Education, Culture and Sport via an FPU doctoral grant (FPU12-04173).
Paper presented at the IROS "Machine Learning in Planning and Control of Robot Motion Workshop" (IROS MLPC), held in Chicago, Illinois (US), from 14 to 18 September.
This item (except texts and images not created by the author) is subject to a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain.
Peer Reviewed
Contemporary Art, Spanish female artists, Àrees temàtiques de la UPC::Informàtica::Robòtica, Acute toxicity, Pavilion 1937, :Automation::Robots::Intelligent robots [Classificació INSPEC], uncertainty handling, Galactosylation, Gender, intelligent robots, Spanish Literature, Maillard reaction, Classificació INSPEC::Automation::Robots::Intelligent robots, and Sexuality Studies, Art and feminism, Women's Studies, Sodium caseinate, Spanish Republic, learning (artificial intelligence), :Informàtica::Robòtica [Àrees temàtiques de la UPC], Other Feminist, planning (artificial intelligence)
| citations | This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 247 |
| downloads | | 1K |

Views provided by UsageCounts