
Approximate dynamic programming (ADP) is the standard tool for solving Markovian decision problems under general hypotheses on the system and cost equations. A key issue of the procedure is how to generate an efficient sampling of the state space, needed to approximate the value function, in order to cope with the well-known curse of dimensionality. The most common approaches in the literature either aim at a uniform covering of the state space or are driven by the actual evolution of the system trajectories. Concerning the latter approach, the F-discrepancy, a quantity closely related to the Kolmogorov-Smirnov statistic that measures how closely a set of random points represents a probability distribution, has recently been proposed for an efficient ADP framework in the finite-horizon case. In this paper, we extend this framework to infinite-horizon discounted problems, providing a constructive algorithm to generate efficient sampling points driven by the system behavior. The algorithm is then refined to obtain a more balanced covering of the state space, addressing possible drawbacks of a purely system-driven sampling approach and yielding, in effect, an efficient hybrid between that approach and a purely uniform design. A theoretical analysis is provided through the introduction of an original notion of F-discrepancy and the proof of its properties. Simulation tests showcase the behavior of the proposed sampling method.
F-discrepancy, state sampling, approximate dynamic programming, Markovian decision problem
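To make the link with the Kolmogorov-Smirnov statistic concrete, the following minimal sketch computes a one-dimensional discrepancy between a point set and a target distribution as the largest gap between the empirical CDF of the points and the target CDF. It is an illustration only: the paper's own notion of F-discrepancy and its sampling algorithm are not reproduced here, and the names `f_discrepancy_1d` and `uniform_cdf` are hypothetical.

```python
def f_discrepancy_1d(points, cdf):
    """Sup-norm gap between the empirical CDF of `points` and `cdf`.

    Assumption: one-dimensional case, where the quantity coincides with
    the Kolmogorov-Smirnov statistic and the supremum is attained at
    (or just before) one of the sample points.
    """
    xs = sorted(points)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one point is required")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so both sides of the jump are compared with the target CDF.
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


# Evenly spread points represent the uniform distribution well ...
balanced = [0.125, 0.375, 0.625, 0.875]
# ... while points clustered in one region do not.
clustered = [0.1, 0.1, 0.1, 0.1]

d_balanced = f_discrepancy_1d(balanced, uniform_cdf)
d_clustered = f_discrepancy_1d(clustered, uniform_cdf)
```

A lower value means the point set follows the target distribution more closely, which is the sense in which a discrepancy of this kind can rank candidate sets of sampling points.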
