Emilie Kaufmann (CNRS, CRIStAL, Université de Lille)

Emilie Kaufmann (CNRS, CRIStAL, Université de Lille)

Solving pure exploration tasks in bandit models

Quand

13 juin 2025    
11h00 - 12h00

Salle du Conseil, Espace Turing
45 rue des Saints-Pères, Paris, 75006

Type d’évènement

Bandit models are well-studied in the machine learning community due to numerous applications to online content optimization. In a multi-armed bandit model, an agent sequentially tries different actions, called arms. Each arm is associated with an unknown probability distribution. In a pure exploration task, the agent wants to learn something about these distributions (e.g., which arms has the largest expectation, in the best identification task) by querying as few samples as possible from them. After presenting a lower bound on the number of samples needed by any algorithm that solves the task with a given error probability, I will present some algorithms matching this lower bound, at least when the error probability is small. The first algorithm, Track and Stop, is directly inspired by the lower bound but can have a high computational cost. To mitigate it in the particular case of the best arm identification task, I will then advocate the use of a family of algorithms called “Top Two” algorithms.

Vous aimerez aussi...