A New Value Iteration method for the Average Cost Dynamic Programming Problem

Creator: Bertsekas, Dimitri P.
Keywords: dynamic programming, Markov and semi-Markov decision processes, value iteration, Dynamic programming in optimal control and differential games, average cost, Programming involving graphs or networks


SIAM Journal on Control and Optimization: Article, 1998, peer-reviewed

Data sources: UnpayWall, zbMATH Open, Crossref, Microsoft Academic Graph


Publication: Article, 01 Mar 1998, English
Publisher: Society for Industrial & Applied Mathematics (SIAM)
Journal: SIAM Journal on Control and Optimization, volume 36, pages 742-759 (ISSN: 0363-0129, eISSN: 1095-7138)

Authors: Bertsekas, Dimitri P.

doi: 10.1137/s0363012995291609


Abstract

Summary: We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. This method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. Contrary to the standard relative value iteration, our method involves a weighted sup-norm contraction, and for this reason it admits a Gauss-Seidel implementation. Computational tests indicate that the Gauss-Seidel version of the new method substantially outperforms the standard method for difficult problems.
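For context on the baseline that the abstract compares against, the following is a minimal sketch of standard relative value iteration on a small made-up problem. It is not the paper's new method. The transition probabilities `P`, the costs `g`, the reference state `ref`, and the function name are all illustrative assumptions. Every transition probability in the example is positive, so every stationary policy is unichain and every state is recurrent under all policies, in line with the assumption stated in the abstract.

```python
def relative_value_iteration(P, g, ref=0, tol=1e-10, max_iter=10_000):
    """Standard relative value iteration (baseline, not the paper's method).

    P[u][i][j]: probability of moving from state i to j under action u.
    g[u][i]:    expected one-stage cost of action u in state i.
    ref:        reference state whose backed-up value is subtracted each sweep.
    Returns (lam, h): average-cost estimate and differential costs, h[ref] = 0.
    """
    n_actions, n_states = len(P), len(g[0])
    h = [0.0] * n_states
    lam = 0.0
    for _ in range(max_iter):
        # Bellman backup using only values from the previous sweep.
        Th = [
            min(
                g[u][i] + sum(P[u][i][j] * h[j] for j in range(n_states))
                for u in range(n_actions)
            )
            for i in range(n_states)
        ]
        lam = Th[ref]                      # current average-cost estimate
        h_new = [v - lam for v in Th]      # renormalize so h_new[ref] == 0
        done = max(abs(a - b) for a, b in zip(h_new, h)) < tol
        h = h_new
        if done:
            break
    return lam, h


# Hypothetical 3-state, 2-action problem (all numbers are made up).
P = [
    [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]],  # action 0
    [[0.1, 0.6, 0.3], [0.4, 0.2, 0.4], [0.2, 0.2, 0.6]],  # action 1
]
g = [
    [2.0, 1.0, 3.0],  # action 0
    [1.5, 2.0, 0.5],  # action 1
]

lam, h = relative_value_iteration(P, g)
print("average cost estimate:", lam)
print("differential costs:", h)
```

At convergence `lam` and `h` approximately satisfy the average cost optimality equation `lam + h[i] = min_u (g[u][i] + sum_j P[u][i][j] * h[j])`. Each sweep here reads only the previous sweep's values; the abstract's claim is that the new method, being a weighted sup-norm contraction, also admits a Gauss-Seidel implementation in which freshly updated values are used within the same sweep.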

Related Organizations

Massachusetts Institute of Technology
United States


Impact by BIP!

- Selected citations: 26 (derived from selected sources; an alternative to the "Influence" indicator)
- Popularity: Top 10% (the "current" impact or attention of the article, based on the underlying citation network)
- Influence: Top 10% (the overall impact of the article, based on the underlying citation network)
- Impulse: Average (the initial momentum of the article directly after its publication, based on the underlying citation network)


Open access route: bronze

Fields of Science

engineering and technology

other engineering and technologies
