Variance Optimization for Continuous-Time Markov Decision Processes

Publication: Article, 01 Jan 2019

Publisher: Scientific Research Publishing, Inc.

Journal: Open Journal of Statistics, volume 9, pages 181-195 (ISSN: 2161-718X, eISSN: 2161-7198)

Authors: Yaqing Fu

doi: 10.4236/ojs.2019.92014


Abstract

This paper considers the variance optimization problem for the average reward in a continuous-time Markov decision process (MDP). The state space is assumed to be countable and the action space to be a Borel measurable space. The main purpose of the paper is to find the policy with minimal variance within the space of deterministic stationary policies. Unlike in a traditional Markov decision process, the cost function under the variance criterion is affected by future actions. To address this, we convert the variance minimization problem into a standard MDP by introducing a concept called pseudo-variance. Further, by giving a policy iteration algorithm for the pseudo-variance optimization problem, we derive the optimal policy of the original variance optimization problem and give a sufficient condition for a variance-optimal policy. Finally, we use an example to illustrate the conclusions of the paper.
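
The abstract does not define pseudo-variance, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes a finite state space, an irreducible generator matrix Q under each deterministic stationary policy, the steady-state variance pi @ (r - eta)**2 as the criterion, and a pseudo-variance in which the policy-dependent mean eta is replaced by a fixed constant lam. Under those assumptions the pseudo-variance equals the variance plus (eta - lam)**2, which is why fixing lam turns the cost into an ordinary per-state cost. The toy model (RATES, REWARDS) and all function names are hypothetical, and the brute-force search stands in for policy iteration.

```python
import itertools
import numpy as np


def stationary_distribution(Q):
    """Solve pi @ Q = 0 with sum(pi) = 1 for an irreducible generator Q."""
    n = Q.shape[0]
    # Stack the balance equations with the normalization constraint.
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def average_and_variance(Q, r):
    """Return (pi, eta, var): stationary law, average reward, steady-state variance."""
    pi = stationary_distribution(Q)
    eta = pi @ r
    var = pi @ (r - eta) ** 2
    return pi, eta, var


def pseudo_variance(Q, r, lam):
    """Assumed definition: the mean eta is replaced by a fixed constant lam."""
    pi = stationary_distribution(Q)
    return pi @ (r - lam) ** 2


# Hypothetical toy model: 2 states, 2 actions per state.
# RATES[s][a] is row s of the generator under action a;
# REWARDS[s][a] is the reward rate in state s under action a.
RATES = {
    0: {0: [-1.0, 1.0], 1: [-3.0, 3.0]},
    1: {0: [2.0, -2.0], 1: [0.5, -0.5]},
}
REWARDS = {
    0: {0: 1.0, 1: 2.0},
    1: {0: 4.0, 1: 3.0},
}


def evaluate_policy(policy):
    """Evaluate a deterministic stationary policy given as a tuple of actions."""
    Q = np.array([RATES[s][a] for s, a in enumerate(policy)])
    r = np.array([REWARDS[s][a] for s, a in enumerate(policy)])
    return average_and_variance(Q, r)


def min_variance_policy():
    """Brute-force search over all deterministic stationary policies."""
    policies = itertools.product(range(2), repeat=2)
    return min(policies, key=lambda d: evaluate_policy(d)[2])
```

Enumeration is only feasible for very small models; it is used here to keep the sketch short and to make the dependence of the variance on the whole policy visible.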

Related Organizations

University of Jinan, China (People's Republic of)
Jinan University, China (People's Republic of)

