Learning equilibrium mean‐variance strategy

Publication: Article, Other literature type. 01 Jan 2020. English. Publisher: Wiley. Journal: Mathematical Finance, volume 33, pages 1166-1212 (ISSN: 0960-1627, eISSN: 1467-9965)

Authors: Min Dai; Yuchao Dong; Yanwei Jia

doi: 10.1111/mafi.12402 , 10.2139/ssrn.3770818

Abstract

We study a dynamic mean‐variance portfolio optimization problem under the reinforcement learning framework, where an entropy regularizer is introduced to induce exploration. Because the mean‐variance criterion is time-inconsistent, we aim to learn an equilibrium policy. Under an incomplete market setting, we obtain a semi‐analytical, exploratory, equilibrium mean‐variance policy that turns out to follow a Gaussian distribution. We then focus on a Gaussian mean return model and propose a reinforcement learning algorithm to find the equilibrium policy. Thanks to a thoroughly designed policy iteration procedure, we prove that the algorithm converges under mild conditions, even though the dynamic programming principle and the usual policy improvement theorem fail to hold for an equilibrium policy. Numerical experiments are given to demonstrate the algorithm. The design and implementation of our reinforcement learning algorithm apply to a general market setup.
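As a rough illustration of what an exploratory Gaussian policy with an entropy term involves, the sketch below samples actions from a one-dimensional Gaussian and evaluates its differential entropy, 0.5 * ln(2 * pi * e * variance). It is not the paper's equilibrium policy: the mean and variance values are hypothetical placeholders, and the function names are invented for this example.

```python
import math
import random


def gaussian_entropy(variance: float) -> float:
    """Differential entropy of a one-dimensional Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def sample_action(mean: float, variance: float, rng: random.Random) -> float:
    """Draw one exploratory action (e.g. an amount invested) from N(mean, variance)."""
    return rng.gauss(mean, math.sqrt(variance))


# Hypothetical policy parameters, chosen only for illustration.
policy_mean = 0.5
policy_variance = 0.04

rng = random.Random(0)  # fixed seed so the run is reproducible
actions = [sample_action(policy_mean, policy_variance, rng) for _ in range(10_000)]

sample_mean = sum(actions) / len(actions)
entropy = gaussian_entropy(policy_variance)

print(f"sample mean of actions: {sample_mean:.4f}")
print(f"policy entropy: {entropy:.4f}")
```

A larger variance gives a larger entropy, which is the sense in which an entropy regularizer rewards more exploratory (more dispersed) policies.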

Related Organizations

Chinese University of Hong Kong, China (People's Republic of)
National University of Singapore, Singapore
New York University, United States
Tongji University, China (People's Republic of)
Hong Kong Polytechnic University, China (People's Republic of)

Keywords

entropy regularized exploration-exploitation, reinforcement learning, Portfolio theory, Learning and adaptive systems in artificial intelligence, equilibrium mean variance analysis, asset allocation

Impact by BIP!

Selected citations: 24. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.