IMED-RL: Regret optimal learning of ergodic Markov decision processes | F Pesquerel, OA Maillard | Advances in Neural Information Processing Systems 35, 26363-26374, 2022 | 10 | 2022 |
Stochastic bandits with groups of similar arms | F Pesquerel, H Saber, OA Maillard | Advances in Neural Information Processing Systems 34, 19461-19472, 2021 | 5 | 2021 |
Logarithmic regret in communicating MDPs: Leveraging known dynamics with bandits | H Saber, F Pesquerel, OA Maillard, MS Talebi | Asian Conference on Machine Learning, 2023 | | 2023 |
Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits | D Baudry, F Pesquerel, R Degenne, OA Maillard | Thirty-seventh Conference on Neural Information Processing Systems, 2023 | | 2023 |