Michael Gimelfarb
Computer Science, University of Toronto
Verified email at mail.utoronto.ca
Title
Cited by
Year
Reinforcement learning with multiple experts: A Bayesian model combination approach
M Gimelfarb, S Sanner, CG Lee
Advances in Neural Information Processing Systems (NeurIPS) 31, 9528-9538, 2018
Cited by 29, 2018
ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning
M Gimelfarb, S Sanner, CG Lee
Uncertainty in Artificial Intelligence (UAI-19) 35, 476-485, 2019
Cited by 25, 2019
Risk-Aware Transfer in Reinforcement Learning using Successor Features
M Gimelfarb, A Barreto, S Sanner, CG Lee
Advances in Neural Information Processing Systems (NeurIPS) 34, 2021
Cited by 16, 2021
pyRDDLGym: From RDDL to Gym environments
A Taitler, M Gimelfarb, J Jeong, S Gopalakrishnan, M Mladenov, X Liu, ...
arXiv preprint arXiv:2211.05939, 2022
Cited by 7, 2022
Contextual policy transfer in reinforcement learning domains via deep mixtures-of-experts
M Gimelfarb, S Sanner, CG Lee
Uncertainty in Artificial Intelligence (UAI-21) 37, 1787-1797, 2021
Cited by 6*, 2021
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
J Jeong, X Wang, M Gimelfarb, H Kim, B Abdulhai, S Sanner
International Conference on Learning Representations (ICLR), 2023
Cited by 4, 2023
A Distributional Framework for Risk-Sensitive End-to-End Planning in Continuous MDPs
N Patton, J Jeong, M Gimelfarb, S Sanner
AAAI Conference on Artificial Intelligence (AAAI) 36 (9), 9894-9901, 2022
Cited by 4*, 2022
Bayesian Experience Reuse for Learning from Multiple Demonstrators
M Gimelfarb, S Sanner, CG Lee
International Joint Conference on Artificial Intelligence (IJCAI) 30, 2021
Cited by 2, 2021
Constraint-Generation Policy Optimization (CGPO): Nonlinear Programming for Policy Optimization in Mixed Discrete-Continuous MDPs
M Gimelfarb, A Taitler, S Sanner
arXiv preprint arXiv:2401.12243, 2024
2024
The 2023 International Planning Competition
A Taitler, R Alford, J Espasa, G Behnke, D Fišer, M Gimelfarb, ...
AI Magazine, 2024
2024
Thompson Sampling for Parameterized Markov Decision Processes with Uninformative Actions
M Gimelfarb, MJ Kim
arXiv preprint arXiv:2305.07844, 2023
2023
Who Should I Trust?: Uncertainty and Risk for Knowledge Transfer from Multiple Sources in Reinforcement Learning Domains
M Gimelfarb
University of Toronto (Canada), 2023
2023
Distributional Reward Shaping: Point Estimates Are All You Need
M Gimelfarb, S Sanner, CG Lee
The Multi-disciplinary Conference on Reinforcement Learning and Decision …, 2022
2022
End-to-End Risk-Aware Planning by Gradient Descent
N Patton, J Jeong, M Gimelfarb, S Sanner
PRL Workshop – Bridging the Gap Between AI Planning and Reinforcement Learning, 2021
2021
Thompson Sampling for the Control of a Queue with Demand Uncertainty
M Gimelfarb
University of Toronto (Canada), 2017
2017
JaxPlan and GurobiPlan: Optimization Baselines for Replanning in Discrete and Mixed Discrete and Continuous Probabilistic Domains
M Gimelfarb, A Taitler, S Sanner
34th International Conference on Automated Planning and Scheduling