Approximate Policy Iteration for Semi-Markov Control Revisited

Type: Article, Conference object
Date: 01 Jan 2011
Language: English
Publisher: Elsevier BV
Journal: Procedia Computer Science, volume 6, pages 249-255 (ISSN: 1877-0509)

Authors: Abhijit Gosavi

doi: 10.1016/j.procs.2011.08.046

Abstract

The semi-Markov decision process can be solved via reinforcement learning without generating its transition model. We briefly review the existing algorithms based on approximate policy iteration (API) for solving this problem for discounted and average reward over an infinite horizon. API techniques have attracted significant interest in the literature recently. We first present and analyze an extension of an existing API algorithm for discounted reward that can handle continuous reward rates. We then consider its average reward counterpart, which requires an update based on the stochastic shortest path (SSP). We study the convergence properties of the algorithm that does not require the SSP update.

Related Organizations

Missouri University of Science and Technology
United States

Keywords

reinforcement learning, average reward, Semi-Markov, approximate policy iteration

Impact by BIP!

	selected citations: 3
	popularity: Average
	influence: Top 10%
	impulse: Average

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
