Reinforcement learning for robot control using probability density estimations

Publication: Conference object, 01 Jan 2010, Spain. Publisher: INSTICC Press. Institute for Systems and Technologies of Information, Control and Communication

Authors: Agostini, Alejandro Gabriel; Celaya Llover, Enric

handle: 2117/10368, 10261/30153

Abstract

The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and, in essence, too large for conventional RL algorithms to work. The well-known curse of dimensionality makes it infeasible to use a tabular representation of the value function, which is the classical approach that provides convergence guarantees. When a function approximation technique is used to generalize among similar states, the convergence of the algorithm is compromised, since updates unavoidably affect an extended region of the domain; that is, the values of some situations are modified without those situations having actually been experienced, and the update may degrade the approximation. We propose an RL algorithm that uses probability density estimation in the joint space of states, actions, and Q-values as a means of function approximation. This allows us to devise an updating approach that, by taking into account the local sampling density, avoids excessive modification of the approximation far from the observed sample.
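As a rough illustration of the idea described in the abstract, the sketch below derives a Q-value from a density estimate over the joint space of states, actions, and Q-values. It is not the authors' algorithm: it assumes a simple Gaussian kernel density estimate over stored samples, for which the conditional mean E[q | s, a] reduces to a kernel-weighted average. The class name `KernelDensityQ`, its methods, and the `bandwidth` parameter are hypothetical names introduced only for this example.

```python
import numpy as np


class KernelDensityQ:
    """Toy Q-function read off a Gaussian kernel density estimate over (s, a, q).

    Illustrative assumption only; not the method of the paper.
    """

    def __init__(self, bandwidth=0.5):
        self.bandwidth = bandwidth  # kernel width in the (state, action) space
        self.points = []            # stored (state, action) vectors
        self.q_values = []          # stored Q-value targets

    @staticmethod
    def _joint(state, action):
        # Concatenate state and action into one input vector.
        return np.concatenate(
            [np.atleast_1d(state), np.atleast_1d(action)]
        ).astype(float)

    def _weights(self, state, action):
        # Gaussian kernel weight of every stored sample at the query point.
        x = self._joint(state, action)
        d2 = np.sum((np.asarray(self.points) - x) ** 2, axis=1)
        return np.exp(-0.5 * d2 / self.bandwidth ** 2)

    def density(self, state, action):
        """Unnormalised local sampling density at (state, action)."""
        if not self.points:
            return 0.0
        return float(self._weights(state, action).sum())

    def q(self, state, action, default=0.0):
        """Conditional mean E[q | s, a] under the joint density estimate."""
        if not self.points:
            return default
        w = self._weights(state, action)
        total = w.sum()
        if total < 1e-12:
            # No sample close enough to say anything; fall back to a prior.
            return default
        return float(w @ np.asarray(self.q_values) / total)

    def add_sample(self, state, action, target):
        # An update only adds density around the observed sample, so
        # estimates far from it are left essentially unchanged.
        self.points.append(self._joint(state, action))
        self.q_values.append(float(target))


model = KernelDensityQ(bandwidth=0.5)
model.add_sample([0.0], [0.0], 1.0)
model.add_sample([5.0], [0.0], -1.0)
print(model.q([0.0], [0.0]), model.q([5.0], [0.0]))
```

In this sketch each new sample contributes a kernel centred on the observed point, so its influence on the estimate decays with distance, and `density` exposes how well a region has been sampled. A practical implementation would need a bounded-memory density model and a learning rule for the targets, neither of which is shown here.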

This work was supported by the project 'CONSOLIDER-INGENIO 2010 Multimodal interaction in pattern recognition and computer vision' (V-00069). This research was partially supported by Consolider Ingenio 2010, project CSD2007-00018.

Presented at ICINCO 2010, held in Funchal (Portugal), 15 to 18 June.

Peer Reviewed

Country

Spain

Related Organizations

Universitat Politècnica de Catalunya
Spain
Spanish National Research Council
Spain

Keywords

Reinforcement learning, Machine learning, Learning (artificial intelligence), Generalisation (artificial intelligence), Intelligent robots, UPC subject areas::Computer science::Artificial intelligence::Machine learning, INSPEC classification::Cybernetics::Artificial intelligence::Learning (artificial intelligence)

Impact by BIP!

Selected citations (citations derived from selected sources): 0
Popularity (current impact or attention in the research community, based on the underlying citation network): Average
Influence (overall impact in the research community, based on the underlying citation network, diachronically): Average
Impulse (initial momentum directly after publication, based on the underlying citation network): Average