publication . Article . Other literature type . 2002

The nonstochastic multiarmed bandit problem

Auer, P.; Nicolò Antonio Cesa-Bianchi; Freund, Y.; Schapire, Re;
  • Published: 01 Jan 2002
Abstract
In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff). Past solutions for the bandit problem have almost always relied on assumptions about the statistics of the slot machines. In this work, we make no statistical assumptions whatsoever about the nature of the process generating the payoffs of the slot machines. We give a s...
Subjects
arXiv: Computer Science::Computer Science and Game Theory
free text keywords: Stochastic game, Mathematical economics, Statistical assumption, Almost surely, Upper and lower bounds, Mathematical optimization, Game theory, Combinatorics, Mathematics, Stochastic process, Minimax, Multi-armed bandit
Powered by OpenAIRE Open Research Graph
Any information missing or wrong?Report an Issue
publication . Article . Other literature type . 2002

The nonstochastic multiarmed bandit problem

Auer, P.; Nicolò Antonio Cesa-Bianchi; Freund, Y.; Schapire, Re;