
arXiv: 2110.14295
In this paper, we establish a subgame perfect equilibrium reinforcement learning (SPERL) framework for time-inconsistent (TIC) problems. In the context of RL, TIC problems are known to face two main challenges: the non-existence of natural recursive relationships between value functions at different time points, and the violation of Bellman's principle of optimality, which calls into question the applicability of standard policy iteration algorithms because their policy improvement theorems no longer hold. We adapt an extended dynamic programming theory and propose a new class of algorithms, called backward policy iteration (BPI), that solves SPERL and addresses both challenges. To demonstrate the practical usage of BPI as a training framework, we adapt standard RL simulation methods and derive two BPI-based training algorithms. We examine our derived training frameworks on a mean-variance portfolio selection problem and evaluate performance metrics including convergence and model identifiability.
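The backward-in-time structure described in the abstract can be illustrated with a toy example. The sketch below is not the paper's BPI algorithm (which additionally maintains auxiliary value functions to handle time inconsistency); it only shows, under assumed toy dynamics `P` and rewards `R`, the backward recursion in which the policy at time t is improved while all future policies are held fixed.

```python
import numpy as np

# Illustrative sketch only: backward-in-time policy construction on a toy
# finite-horizon tabular MDP. All quantities (T, nS, nA, P, R) are made up
# for demonstration; the paper's BPI handles TIC objectives, which this
# standard-objective example does not.

T, nS, nA = 5, 4, 2
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a] -> next-state distribution
R = rng.random((nS, nA))                        # immediate rewards in [0, 1)

V = np.zeros((T + 1, nS))                       # terminal value fixed at zero
pi = np.zeros((T, nS), dtype=int)

for t in range(T - 1, -1, -1):                  # iterate backward over time
    Q = R + P @ V[t + 1]                        # Q[s, a] given future policies fixed
    pi[t] = Q.argmax(axis=1)                    # improve the policy at time t only
    V[t] = Q.max(axis=1)                        # value under the improved policy

print(pi[0], V[0].round(3))
```

Because each time step's policy is determined with all later policies already fixed, no earlier update can invalidate a later one, which is the structural property that makes a backward scheme natural for equilibrium (rather than globally optimal) policies.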
FOS: Computer and information sciences, reinforcement learning, mean-variance analysis, Computer Science - Machine Learning, time inconsistency, consistent planning, subgame perfect equilibrium, Systems and Control (eess.SY), Electrical Engineering and Systems Science - Systems and Control, Machine Learning (cs.LG), intrapersonal game, Portfolio theory, Computer Science - Computer Science and Game Theory, Optimization and Control (math.OC), Applications of game theory, FOS: Electrical engineering, electronic engineering, information engineering, FOS: Mathematics, Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.), training algorithms, Mathematics - Optimization and Control, Computer Science and Game Theory (cs.GT)
