publication . Article . Conference object . Preprint . 2010

Kalman Temporal Differences

Geist, Matthieu; Pietquin, Olivier
Open Access
  • Published: 01 Oct 2010. Journal: Journal of Artificial Intelligence Research, volume 39, pages 483-532 (eISSN: 1076-9757)
  • Publisher: AI Access Foundation
  • Country: France
Abstract
Because reinforcement learning suffers from a lack of scalability, online value (and Q-) function approximation has received increasing interest over the last decade. This contribution introduces a novel approximation scheme, the Kalman Temporal Differences (KTD) framework, which exhibits the following features: sample efficiency, non-linear approximation, handling of non-stationarity, and uncertainty management. A first KTD-based algorithm is provided for deterministic Markov Decision Processes (MDPs); it produces biased estimates in the case of stochastic transitions. Then the eXtended KTD framework (XKTD), solving stochastic MDPs, is des...
Subjects
Free-text keywords: Artificial Intelligence, Function approximation, Machine learning, Convergence (routing), Kalman filter, Markov decision process, Computer science, Mathematical optimization, Reinforcement learning, Scalability, [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], Computer Science - Learning