Michel Ma
PhD candidate, University of Montreal, Mila
Verified email at mila.quebec
Title
Cited by
Year
When do transformers shine in RL? Decoupling memory from credit assignment
T Ni, M Ma, B Eysenbach, PL Bacon
Advances in Neural Information Processing Systems 36, 2024
Cited by: 7, Year: 2024
Long-term credit assignment via model-based temporal shortcuts
M Ma, P D'Oro, Y Bengio, PL Bacon
Deep RL Workshop NeurIPS 2021, 2021
Cited by: 5, Year: 2021
Counterfactual Policy Evaluation and the Conditional Monte Carlo Method
M Ma, PL Bacon
Offline Reinforcement Learning Workshop, NeurIPS, 2020
Cited by: 1, Year: 2020
Do Transformer World Models Give Better Policy Gradients?
M Ma, T Ni, C Gehring, P D'Oro, PL Bacon
arXiv preprint arXiv:2402.05290, 2024
2024
Bridging State and History Representations: Understanding Self-Predictive RL
T Ni, B Eysenbach, E Seyedsalehi, M Ma, C Gehring, A Mahajan, ...
arXiv preprint arXiv:2401.08898, 2024
2024
A Differentiable Sequence Model Perspective on Policy Gradients
M Ma, P D'Oro, T Ni, C Gehring, PL Bacon
2023
Parsimonious reasoning in reinforcement learning for better credit assignment
M Ma
2022