Solenne GAUCHER (Orsay) – "Introduction to stochastic bandits"

March 17 @ 2:00 pm - 3:00 pm | Organizers: François-Pierre Paty, Nicolas Schreuder
Statistics-Econometrics-Machine Learning Seminar.
Time: 14:00 – 15:00
Date: 17th of March 2021
Place: Online
Abstract:
The stochastic multi-armed bandit models the following problem: at each time step, an agent must choose an action from a finite set, and receives a reward drawn i.i.d. from a distribution that depends on the action she has selected. Her aim is to maximise her cumulative reward. The agent thus faces a trade-off between collecting information on the mechanism generating the rewards and taking the best action given the information collected so far, so as to maximise her immediate reward.
In this talk, I will present classical results in the stochastic multi-armed bandit setting. Next, I will show how these results can be extended to the continuum-armed bandit framework, where the expected reward for taking an action is modeled as a function of a covariate describing this action.
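
As an illustration of the finite-armed setting described in the abstract, here is a minimal Python sketch of UCB1, one classical strategy for balancing exploration and exploitation. It is not taken from the talk: the Bernoulli reward model, the function name `ucb1`, the arm means and the horizon are all assumptions chosen for the example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms (hypothetical setup for illustration).

    arm_means: success probability of each arm, unknown to the agent.
    horizon:   number of time steps, assumed >= number of arms.
    Returns (total_reward, pull_counts).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # cumulative reward collected from each arm
    total = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound:
            # empirical mean (exploitation) + bonus that shrinks as
            # the arm is pulled more often (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Reward drawn i.i.d. from the chosen arm's distribution.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward

    return total, counts


total_reward, pull_counts = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(total_reward, pull_counts)
```

With well-separated arm means such as these, most pulls should concentrate on the best arm, while the other arms are still sampled occasionally to keep their estimates reliable.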