Marc JOURDAN (EPFL) – Advances in Pure Exploration in Bandits: Non-Asymptotic, Private, and Structured

Name: Marc JOURDAN (EPFL) – Advances in Pure Exploration in Bandits: Non-Asymptotic, Private, and Structured
Start: 2025-11-10T14:00:00+02:00
End: 2025-11-10T15:30:00+02:00

November 10 @ 2:00 pm - 3:30 pm

Statistical Seminar: Every Monday at 2:00 pm.
Time: 2:00 pm – 3:00 pm
Date: November 10th
Place: 3001

Abstract:

In pure exploration problems for stochastic multi-armed bandits, the goal is to answer a question about a set of unknown distributions (for example, the efficacy of a treatment) by strategically sampling from them, while providing guarantees on the returned answer. The archetypal example is the best arm identification problem, where the task is to find the arm with the largest mean. Top Two algorithms, which select the next arm to sample from among a leader and a challenger, have received significant attention in recent years due to their simplicity and interpretability.

In this talk, I will present recent advances on three complementary aspects of pure exploration: achieving non-asymptotic guarantees, ensuring differential privacy, and leveraging problem structure. First, we propose a Top Two algorithm which has an asymptotically optimal expected sample complexity, and also provides anytime guarantees on the probability of misidentifying a sufficiently good arm. Second, we show how the Top Two principle can be combined with differential privacy mechanisms, leading to algorithms that preserve near-optimal efficiency while ensuring privacy guarantees. Finally, we address structured pure exploration by overcoming the computational bottleneck of Pareto set identification in linear bandits, through a game-based algorithm grounded in posterior sampling. These results not only deepen our theoretical understanding but also enable more practical and privacy-aware bandit algorithms for real-world problems.

Bio:

I am a postdoctoral researcher at EPFL in the Theory of Machine Learning lab, working with Nicolas Flammarion on the theoretical foundations of post-training for Large Language Models. I earned my PhD in Computer Science from the University of Lille, under the supervision of Emilie Kaufmann and Rémy Degenne within the Inria Scool team. I studied pure exploration in stochastic bandits and helped establish the Top Two approach as a principled methodology with strong theoretical and empirical performance. Previously, I graduated from École Polytechnique and ETH Zurich.

https://marcjourdan.github.io/

Organizers:

Anna KORBA (CREST), Karim LOUNICI (CMAP) , Jaouad MOURTADA (CREST)

Sponsors:
CREST-CMAP

+ Google Calendar + iCal Export

Latest News

Announcements

Marc JOURDAN (EPFL) – Advances in Pure Exploration in Bandits: Non-Asymptotic, Private, and Structured