
arXiv: 2409.04037
In this paper, we propose a new policy iteration algorithm to compute the value function and the optimal controls of continuous-time stochastic control problems. The algorithm relies on successive approximations by linear-quadratic control problems, all of which can be solved explicitly and, in the Markovian case, only require solving linear PDEs recursively. Although, unlike the standard algorithm, our procedure does not in general produce a non-decreasing sequence, it can be made arbitrarily close to monotone. More importantly, we recover the standard exponential speed of convergence for both the value and the controls, through purely probabilistic arguments that are significantly simpler than in the classical case. Our proof also accommodates non-Markovian dynamics as well as volatility control, allowing us to obtain the first convergence results in the latter case for a multidimensional state process.
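For orientation, the "standard algorithm" mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the algorithm proposed in the paper: it is the classical textbook (Kleinman-type) policy iteration for a deterministic scalar linear-quadratic problem, chosen here only because each step has a closed form. The function names, the parameters `a`, `b`, `q`, `r` and the initial gain `k0` are all assumptions made for illustration. Since this toy problem is a minimisation, the monotone sequence is non-increasing rather than non-decreasing.

```python
import math


def policy_iteration_lqr(a, b, q, r, k0, n_iter=8):
    """Classical policy iteration for the scalar toy problem (an assumption,
    not the paper's setting):

        minimise  int_0^inf (q x^2 + r u^2) dt   subject to  dx = (a x + b u) dt,

    over linear feedback controls u = -k x.
    """
    k = k0
    values = []
    for _ in range(n_iter):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("feedback gain must stabilise the system")
        # Policy evaluation: the cost of u = -k x is p * x^2, where p solves
        # the linear equation 2 (a - b k) p + q + r k^2 = 0.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        values.append(p)
        # Policy improvement: minimise the Hamiltonian pointwise in u.
        k = b * p / r
    return values, k


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2 a p - b^2 p^2 / r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


values, gain = policy_iteration_lqr(a=1.0, b=1.0, q=1.0, r=1.0, k0=5.0)
optimal = riccati_solution(1.0, 1.0, 1.0, 1.0)
```

With these illustrative parameters, the successive values in `values` decrease towards `optimal`, which is the monotonicity property that the abstract says the proposed procedure satisfies only approximately.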
18 pages
Optimization and Control (math.OC), Probability (math.PR), FOS: Mathematics, Mathematics - Optimization and Control, Mathematics - Probability
| Indicator | Value |
| --- | --- |
| Selected citations | 0 |
| Popularity | Average |
| Influence | Average |
| Impulse | Average |
