Deep fictitious play for stochastic differential games

Name: Deep fictitious play for stochastic differential games
Creator: Ruimeng Hu
Keywords: FOS: Computer and information sciences, Computer Science - Computer Science and Game Theory, Statistics - Machine Learning, Optimization and Control (math.OC), 0502 economics and business, 05 social sciences, FOS: Mathematics, Machine Learning (stat.ML), 0101 mathematics, Mathematics - Optimization and Control

Ruimeng Hu

Communications in Mathematical Sciences

Article

Data sources: UnpayWall

arXiv.org e-Print Archive

Preprint . 2019

Data sources: arXiv.org e-Print Archive

Communications in Mathematical Sciences

Article . 2021 . Peer-reviewed

Data sources: Crossref

https://dx.doi.org/10.48550/ar...

Article . 2019

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

DBLP

Article . 2019

Data sources: DBLP

https://dx.doi.org/10.4310/cms...

Article

Data sources: Microsoft Academic Graph

Publication: Article, Preprint. 01 Jan 2021. Embargo end date: 01 Jan 2019. English.
Publisher: International Press of Boston
Journal: Communications in Mathematical Sciences, volume 19, pages 325-353 (issn: 1539-6746, eissn: 1945-0796)
Funded by: NSF | Collaborative Research: M...

Authors: Ruimeng Hu;

doi: 10.4310/cms.2021.v19.n2.a2 , 10.48550/arxiv.1903.09376

arXiv: 1903.09376

Abstract

In this paper, we apply the idea of fictitious play to design deep neural networks (DNNs), and develop deep learning theory and algorithms for computing the Nash equilibrium of asymmetric $N$-player non-zero-sum stochastic differential games, which we refer to as \emph{deep fictitious play}, a multi-stage learning process. Specifically, at each stage, we propose the strategy of letting each player optimize her own payoff subject to the other players' previous actions. This is equivalent to solving $N$ decoupled stochastic control optimization problems, which are approximated by DNNs. Therefore, the fictitious play strategy leads to a structure consisting of $N$ DNNs, which only communicate at the end of each stage. The resulting deep learning algorithm based on fictitious play is scalable, parallel and model-free, {\it i.e.}, using GPU parallelization, it can be applied to any $N$-player stochastic differential game with different symmetries and heterogeneities ({\it e.g.}, existence of major players). We illustrate the performance of the deep learning algorithm by comparing it to the closed-form solution of the linear quadratic game. Moreover, we prove the convergence of fictitious play under appropriate assumptions, and verify that the convergent limit forms an open-loop Nash equilibrium. We also discuss extensions to other strategies designed upon fictitious play and to closed-loop Nash equilibrium at the end.
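
The stage structure described in the abstract can be sketched on a toy problem. This is only an illustration, not the paper's algorithm: it swaps the stochastic control problems and DNNs for a static quadratic game in which each player's best response has a closed form. The cost function, the parameters `b` and `c`, and all function names are assumptions introduced here. Player $i$ is assumed to minimize $\frac{1}{2}a_i^2 - a_i\,(b_i + c\,\bar{a}_{-i})$, where $\bar{a}_{-i}$ is the mean of the other players' actions from the previous stage.

```python
def best_response(i, prev_actions, b, c):
    """Closed-form minimizer of the assumed cost of player i,
    0.5*a_i**2 - a_i*(b[i] + c*mean(others)), given the other
    players' actions from the previous stage (hypothetical toy cost)."""
    others = [a for j, a in enumerate(prev_actions) if j != i]
    return b[i] + c * sum(others) / len(others)


def fictitious_play(b, c, stages=60):
    """Run the multi-stage loop: N decoupled problems per stage,
    with actions shared only at the end of each stage."""
    actions = [0.0] * len(b)
    for _ in range(stages):
        # Every best response reads the previous stage's actions only,
        # so the N problems are independent and could run in parallel.
        actions = [best_response(i, actions, b, c) for i in range(len(b))]
    return actions


def closed_form_equilibrium(b, c):
    """Nash equilibrium of the toy game, from the fixed-point
    equations a_i = b_i + c*mean(others); requires c != 1."""
    n = len(b)
    total = sum(b) / (1.0 - c)  # sum of equilibrium actions
    return [(b_i + c * total / (n - 1)) / (1.0 + c / (n - 1)) for b_i in b]


if __name__ == "__main__":
    b, c = [1.0, 2.0, 3.0], 0.5
    print(fictitious_play(b, c))
    print(closed_form_equilibrium(b, c))
```

For $|c| < 1$ the stage map in this toy game is a contraction, so the iterates approach the closed-form equilibrium. This mirrors, in a much simpler setting, the abstract's comparison against the closed-form solution of the linear quadratic game.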

Related Organizations

University of California, Santa Barbara
United States
Columbia University
United States
King’s University
United States

Keywords

FOS: Computer and information sciences, Computer Science - Computer Science and Game Theory, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, 91A15, 91B50, 91A26, 68T20, 60G99, Computer Science and Game Theory (cs.GT)

Impact by BIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	10
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Top 10%