Esther Derman
Mila - Quebec AI Institute
Verified email at mila.quebec
Title
Cited by
Year
Soft-Robust Actor-Critic Policy-Gradient
E Derman, DJ Mankowitz, TA Mann, S Mannor
AUAI press for Association for Uncertainty in Artificial Intelligence, 208-218, 2018
Cited by: 63; Year: 2018
A Bayesian approach to robust reinforcement learning
E Derman, D Mankowitz, T Mann, S Mannor
Uncertainty in Artificial Intelligence, 648-658, 2020
Cited by: 52; Year: 2020
Distributional Robustness and Regularization in Reinforcement Learning
E Derman, S Mannor
ICML Workshop on Theoretical Foundations of Reinforcement Learning, 2020
Cited by: 43; Year: 2020
Twice regularized MDPs and the equivalence between robustness and regularization
E Derman, M Geist, S Mannor
Advances in Neural Information Processing Systems 34, 22274-22287, 2021
Cited by: 41; Year: 2021
Acting in Delayed Environments with Non-Stationary Markov Policies
E Derman, G Dalal, S Mannor
International Conference on Learning Representations (ICLR), 2021
Cited by: 27; Year: 2021
Policy Gradient for Rectangular Robust Markov Decision Processes
N Kumar, E Derman, M Geist, K Levy, S Mannor
Advances in Neural Information Processing Systems 36, 2024
Cited by: 22; Year: 2024
Clustering and model selection via penalized likelihood for different-sized categorical data vectors
E Derman, E Le Pennec
arXiv preprint arXiv:1709.02294, 2017
Cited by: 3; Year: 2017
Solving non-rectangular reward-robust MDPs via frequency regularization
U Gadot, E Derman, N Kumar, MM Elfatihi, K Levy, S Mannor
Proceedings of the AAAI Conference on Artificial Intelligence 38 (19), 21090 …, 2024
Cited by: 1; Year: 2024
Twice regularized Markov decision processes: The equivalence between robustness and regularization
E Derman, Y Men, M Geist, S Mannor
arXiv preprint arXiv:2303.06654, 2023
Cited by: 1; Year: 2023
Tree Search-Based Policy Optimization under Stochastic Execution Delay
D Valensi, E Derman, S Mannor, G Dalal
The Twelfth International Conference on Learning Representations, 2023
Year: 2023
Targeted Uncertainty Reduction in Robust MDPs
U Gadot, K Wang, E Derman, N Kumar, K Levy, S Mannor
NeurIPS 2023 Workshop on Generalization in Planning