Meta-Learning in Self-Play Regret Minimization

Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computer Science and Game Theory, Computer Science and Game Theory (cs.GT), Machine Learning (cs.LG)

Sychrovský, David; Schmid, Martin; Šustr, Michal; Bowling, Michael


arXiv.org e-Print Archive

Preprint . 2025

Data sources: arXiv.org e-Print Archive


Article . 2025

License: CC BY

Data sources: Datacite


Publication type: Article, Preprint; Date: 01 Jan 2025; Embargo end date: 01 Jan 2025; Publisher: arXiv

doi: 10.48550/arxiv.2504.18917

arXiv: 2504.18917

Abstract

Regret minimization is a general approach to online optimization which plays a crucial role in many algorithms for approximating Nash equilibria in two-player zero-sum games. The literature mainly focuses on solving individual games in isolation. In practice, however, players often encounter a distribution of similar but distinct games, for example when trading correlated assets on the stock market or when refining the strategy in subgames of a much larger game. Recently, offline meta-learning was used to accelerate one-sided equilibrium finding on such distributions. We build upon this work, extending the framework to the more challenging self-play setting, which is the basis for most state-of-the-art equilibrium approximation algorithms for domains at scale. When selecting the strategy, our method uniquely integrates information across all decision states, promoting global communication as opposed to the traditional local regret decomposition. Empirical evaluation on normal-form games and river poker subgames shows that our meta-learned algorithms considerably outperform other state-of-the-art regret minimization algorithms.
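As background for the self-play setting the abstract refers to, the sketch below runs plain regret matching for both players of a small zero-sum matrix game and measures how far the average strategies are from a Nash equilibrium. It is not the meta-learned algorithm from the paper: the function names, the 2x2 payoff matrix, and the iteration count are all illustrative assumptions, and regret matching here stands in for the kind of baseline regret minimizer the abstract mentions.

```python
# Illustrative sketch only: standard regret matching in self-play,
# NOT the paper's meta-learned method. All names are hypothetical.

def regret_matching_strategy(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        return [1.0 / n] * n  # no positive regret yet: play uniformly
    return [p / total for p in positive]


def self_play(payoff, iterations):
    """Run regret matching for both players of a zero-sum matrix game.

    payoff[i][j] is the row player's utility; the column player receives
    -payoff[i][j]. Returns the average strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    avg_row, avg_col = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching_strategy(regret_row)
        y = regret_matching_strategy(regret_col)
        # Utility of each pure action against the opponent's current strategy.
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                 for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                 for j in range(n_cols)]
        v_row = sum(x[i] * u_row[i] for i in range(n_rows))
        v_col = sum(y[j] * u_col[j] for j in range(n_cols))
        for i in range(n_rows):
            regret_row[i] += u_row[i] - v_row  # instantaneous regret
            avg_row[i] += x[i] / iterations
        for j in range(n_cols):
            regret_col[j] += u_col[j] - v_col
            avg_col[j] += y[j] / iterations
    return avg_row, avg_col


def best_response_gap(payoff, x, y):
    """Sum of both players' best-response gains; zero at a Nash equilibrium."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_values = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                  for i in range(n_rows)]
    col_values = [sum(payoff[i][j] * x[i] for i in range(n_rows))
                  for j in range(n_cols)]
    return max(row_values) - min(col_values)


# Assumed example game: an asymmetric variant of matching pennies.
GAME = [[2.0, -1.0], [-1.0, 1.0]]
avg_x, avg_y = self_play(GAME, 10000)
gap = best_response_gap(GAME, avg_x, avg_y)
```

In self-play the gap equals the sum of the two players' average regrets, so any algorithm that drives regret down faster also approaches equilibrium faster. That is the quantity a meta-learned regret minimizer would aim to reduce across a distribution of games.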
