A Survey of Multi-Objective Sequential Decision-Making

Preprint, Article English OPEN
Roijers, D.M. ; Vamplew, P. ; Whiteson, S. ; Dazeley, R. (2013)

Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-objective problems. Therefore, we identify three distinct scenarios in which converting such a problem to a single-objective one is impossible, infeasible, or undesirable. Furthermore, we propose a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function (which projects multi-objective values to scalar ones), and the type of policies considered. We show how these factors determine the nature of an optimal solution, which can be a single policy, a convex hull, or a Pareto front. Using this taxonomy, we survey the literature on multi-objective methods for planning and learning. Finally, we discuss key applications of such methods and outline opportunities for future work.
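To make the abstract's distinction between a Pareto front and a convex hull concrete, the following is a minimal sketch rather than anything taken from the article. It assumes each policy is summarized by a vector of expected returns, one per objective, with higher being better. The value vectors, the weight grid, and the helper names (`dominates`, `pareto_front`, `linear_scalarize`, `best_under_weights`) are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Value = Tuple[float, ...]


def dominates(a: Value, b: Value) -> bool:
    """True if a is at least as good as b in every objective and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(values: Sequence[Value]) -> List[Value]:
    """Keep the value vectors that no other vector dominates."""
    return [v for v in values if not any(dominates(u, v) for u in values)]


def linear_scalarize(v: Value, w: Sequence[float]) -> float:
    """Project a multi-objective value to a scalar with a weighted sum."""
    return sum(wi * vi for wi, vi in zip(w, v))


def best_under_weights(values: Sequence[Value], w: Sequence[float]) -> Value:
    """Return the value vector that maximizes the weighted sum for weights w."""
    return max(values, key=lambda v: linear_scalarize(v, w))


# Hypothetical value vectors for four policies over two objectives.
values = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (0.3, 0.3)]

front = pareto_front(values)

# Sweep a coarse grid of weights (w, 1 - w) and record which vectors ever win.
weight_grid = [(i / 10, 1 - i / 10) for i in range(11)]
linear_winners = {best_under_weights(values, w) for w in weight_grid}

print("Pareto front:", front)
print("Optimal for some linear weights:", sorted(linear_winners))
```

In this toy data, (0.4, 0.4) is undominated and so belongs to the Pareto front, yet no weighted sum selects it, because one of the two extreme vectors always scores at least 0.5. It lies inside the convex hull of the other two, which illustrates why the form of the scalarization function changes which set of policies counts as a solution.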