Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ Data Mining and Know...arrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
Data Mining and Knowledge Discovery
Article . 2024 . Peer-reviewed
License: CC BY
Data sources: Crossref
image/svg+xml Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao Closed Access logo, derived from PLoS Open Access logo. This version with transparent background. http://commons.wikimedia.org/wiki/File:Closed_Access_logo_transparent.svg Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao
zbMATH Open
Article . 2025
Data sources: zbMATH Open
DBLP
Article
Data sources: DBLP
versions View all 4 versions
addClaim

Thompson sampling-based recursive block elimination for dynamic assignment under limited budget in pure-exploration

Authors: Shameem Puthiya Parambath; Christos Anagnostopoulos 0001; Saleh Abdullah M. Alfahad;

Thompson sampling-based recursive block elimination for dynamic assignment under limited budget in pure-exploration

Abstract

Abstract In this paper, we investigate Thompson sampling-based sequential block elimination approaches for dynamic assignment problems in a pure-exploration Multi-Armed Bandit (MAB) setting with limited budget constraints. The problem can be considered as a bandit game-play between the environment and a decision-maker in a metric space. Many instances of problems in fields such as e-commerce, logistics, mobility management, data management and operations research can be framed as dynamic assignment problems with budget constraints. Given an l-dimensional action space representing l variants of an entity and a budget for exploring the action space, the optimal dynamic assignment problem refers to the task of identifying the values to be assigned to different variants of the entity that maximizes the total reward by utilizing at most the given budget of rounds of play. We contribute a class of block elimination-based MAB algorithms specifically designed for the dynamic assignment problem with limited budget. Our algorithms begin by discretizing the continuous action space into a finite set of discrete actions, then proceed with a recursive block elimination procedure to remove sub-optimal actions. The elimination is carried out by calculating confidence bounds over blocks of actions. We explore two different confidence bound estimation techniques. We perform comprehensive experiments on two problem instances from distributed data management and logistics. Our results showcase that our approach yields a lower misidentification probability (i.e., the probability of recommending a non-optimal action) compared to state-of-the-art elimination-based pure-exploration MAB algorithms.

Related Organizations
Keywords

learning with feedback, dynamic assignment, Sampling theory, sample surveys, multi-armed bandits, dynamic pricing, pure-exploration, sequential learning, Thompson sampling

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    1
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
1
Average
Average
Average
hybrid