A policy iteration algorithm for non-Markovian control problems

Name: A policy iteration algorithm for non-Markovian control problems
Keywords: Optimization and Control (math.OC), Probability (math.PR), FOS: Mathematics, Mathematics - Optimization and Control, Mathematics - Probability

Publication: Article, Preprint. 01 Jan 2024. Embargo end date: 01 Jan 2024. Publisher: arXiv. Funded by: NSF | CAREER: A new form of pro...

Authors: Possamaï, Dylan; Tangpi, Ludovic;

doi: 10.48550/arxiv.2409.04037

arXiv: 2409.04037

Abstract

In this paper, we propose a new policy iteration algorithm to compute the value function and the optimal controls of continuous-time stochastic control problems. The algorithm relies on successive approximations by linear-quadratic control problems, all of which can be solved explicitly and which, in the Markovian case, only require solving linear PDEs recursively. Though our procedure fails in general to produce a non-decreasing sequence like the standard algorithm, it can be made arbitrarily close to being monotone. More importantly, we recover the standard exponential speed of convergence for both the value and the controls, through purely probabilistic arguments which are significantly simpler than in the classical case. Our proof also accommodates non-Markovian dynamics as well as volatility control, allowing us to obtain the first convergence results in the latter case for a multi-dimensional state process.
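For readers unfamiliar with the "standard algorithm" the abstract compares against, the following is a minimal sketch of classical policy iteration on a toy problem. It is not the authors' method: it assumes a discrete-time, scalar, deterministic linear-quadratic regulator with dynamics x' = a*x + b*u and running cost q*x^2 + r*u^2, and all names (`evaluate_policy`, `improve_policy`, `policy_iteration`) and parameter values are hypothetical. Because this toy minimises a cost, the monotone sequence is non-increasing rather than non-decreasing.

```python
# Toy illustration only: classical policy iteration for a scalar,
# discrete-time LQ regulator. Assumed setup (not from the paper):
#   dynamics  x' = a*x + b*u,   cost  sum_t (q*x_t^2 + r*u_t^2),
#   linear feedback policies u = -k*x.

def evaluate_policy(k, a, b, q, r):
    """Cost coefficient p of the policy u = -k*x, i.e. J(x) = p*x^2."""
    closed = a - b * k  # closed-loop multiplier
    if abs(closed) >= 1.0:
        raise ValueError("gain does not stabilise the system")
    # p solves p = q + r*k^2 + p*closed^2
    return (q + r * k * k) / (1.0 - closed * closed)


def improve_policy(p, a, b, r):
    """Gain minimising q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u."""
    return a * b * p / (r + b * b * p)


def policy_iteration(k0, a=1.0, b=1.0, q=1.0, r=1.0, n_iter=8):
    """Alternate evaluation and improvement; return final gain and costs."""
    k, costs = k0, []
    for _ in range(n_iter):
        p = evaluate_policy(k, a, b, q, r)
        costs.append(p)
        k = improve_policy(p, a, b, r)
    return k, costs


if __name__ == "__main__":
    gain, costs = policy_iteration(k0=0.5)
    for i, p in enumerate(costs):
        print(i, p)
```

With the default parameters, the cost coefficients decrease towards the positive root of p^2 - p - 1 = 0, which is the fixed point of the associated Riccati equation for a = b = q = r = 1.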

18 pages

Related Organizations

Princeton University
United States
ETH Zurich
Switzerland
College of New Jersey
United States

Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Funded by

NSF | CAREER: A new form of propagation of chaos and its applications to large population games and risk management