handle: 2117/10368, 10261/30153
The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and, in essence, too large for conventional RL algorithms to handle. The well-known curse of dimensionality makes it infeasible to use a tabular representation of the value function, which is the classical approach that provides convergence guarantees. When a function approximation technique is used to generalize among similar states, the convergence of the algorithm is compromised, since each update unavoidably affects an extended region of the domain: some situations are modified in a way that has not actually been experienced, and the update may degrade the approximation. We propose an RL algorithm that uses a probability density estimate in the joint space of states, actions, and Q-values as a means of function approximation. This allows us to devise an updating scheme that, by taking into account the local sampling density, avoids excessively modifying the approximation far from the observed sample.
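The density-aware update described in the abstract can be sketched as follows. This is an illustrative sketch under our own assumptions, not the authors' exact algorithm: we stand in a Gaussian kernel density over stored (state, action, Q) samples, estimate Q(s, a) as a kernel-weighted (Nadaraya-Watson) average, and scale each stored sample's correction by its kernel weight so that regions far from the observed sample are left essentially untouched. The class and method names (`DensityQApproximator`, `q_value`, `update`) are hypothetical.

```python
import numpy as np

class DensityQApproximator:
    """Hypothetical sketch: Q-function approximation via a Gaussian kernel
    density over samples in the joint (state, action, Q-value) space."""

    def __init__(self, bandwidth=0.5):
        self.h = bandwidth
        self.samples = []  # list of (x, q), with x = concatenated (state, action)

    def _weights(self, x):
        # Gaussian kernel weights of the query point w.r.t. stored samples;
        # weights decay quickly with distance, which localizes both estimates
        # and updates.
        X = np.array([xi for xi, _ in self.samples])
        d2 = np.sum((X - x) ** 2, axis=1)
        return np.exp(-d2 / (2 * self.h ** 2))

    def q_value(self, state, action):
        # Nadaraya-Watson estimate: density-weighted average of stored Q-values.
        if not self.samples:
            return 0.0
        x = np.concatenate([state, action])
        w = self._weights(x)
        q = np.array([qi for _, qi in self.samples])
        return float(np.dot(w, q) / (np.sum(w) + 1e-12))

    def update(self, state, action, target, lr=0.5):
        # Density-aware update: each stored sample is corrected toward the
        # target in proportion to its kernel weight, so the approximation is
        # barely modified far from the observed (state, action) pair.
        x = np.concatenate([state, action])
        if self.samples:
            w = self._weights(x)
            for i, (xi, qi) in enumerate(self.samples):
                self.samples[i] = (xi, qi + lr * w[i] * (target - qi))
        self.samples.append((x, float(target)))
```

In this sketch, an update observed at one point leaves distant stored samples almost unchanged, which is the locality property the abstract argues for: a sample at a far-away state keeps its Q-value even after repeated updates elsewhere.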
This work was supported by the project 'CONSOLIDER-INGENIO 2010: Multimodal interaction in pattern recognition and computer vision' (V-00069), and partially supported by the Consolider Ingenio 2010 project CSD2007-00018.
Presented at ICINCO 2010, held in Funchal (Portugal) from 15 to 18 June.
Peer Reviewed
Subjects: Reinforcement learning; Machine learning; Aprenentatge automàtic; generalisation (artificial intelligence); intelligent robots; learning (artificial intelligence); Àrees temàtiques de la UPC: Informàtica::Intel·ligència artificial::Aprenentatge automàtic; Classificació INSPEC: Cybernetics::Artificial intelligence::Learning (artificial intelligence)
| indicator | description | value |
| --- | --- | --- |
| selected citations | Citations derived from selected sources; an alternative to the "influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 72 |
| downloads | | 67 |

Views provided by UsageCounts
Downloads provided by UsageCounts