Yash Chandak
Postdoctoral Scholar, Stanford University
Verified email at stanford.edu
Title · Cited by · Year
Learning action representations for reinforcement learning
Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas
International Conference on Machine Learning, 941-950, 2019
181 · 2019
Optimizing for the future in non-stationary MDPs
Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ...
International Conference on Machine Learning, 1414-1425, 2020
63 · 2020
Evaluating the performance of reinforcement learning algorithms
S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas
International Conference on Machine Learning, 4962-4973, 2020
60 · 2020
Universal off-policy evaluation
Y Chandak, S Niekum, B da Silva, E Learned-Miller, E Brunskill, ...
Advances in Neural Information Processing Systems 34, 27475-27490, 2021
48 · 2021
Lifelong learning with a changing action set
Y Chandak, G Theocharous, C Nota, P Thomas
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3373-3380, 2020
31 · 2020
Towards safe policy improvement for non-stationary MDPs
Y Chandak, S Jordan, G Theocharous, M White, PS Thomas
Advances in Neural Information Processing Systems 33, 9156-9168, 2020
26 · 2020
Understanding self-predictive learning for reinforcement learning
Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...
International Conference on Machine Learning, 33632-33656, 2023
21 · 2023
Supervised pretraining can learn in-context reinforcement learning
J Lee, A Xie, A Pacchiano, Y Chandak, C Finn, O Nachum, E Brunskill
Advances in Neural Information Processing Systems 36, 2024
19 · 2024
Reinforcement learning for strategic recommendations
G Theocharous, Y Chandak, PS Thomas, F de Nijs
arXiv preprint arXiv:2009.07346, 2020
12 · 2020
Fusion graph convolutional networks
P Vijayan, Y Chandak, MM Khapra, S Parthasarathy, B Ravindran
arXiv preprint arXiv:1805.12528, 2018
11 · 2018
Reinforcement learning when all actions are not always available
Y Chandak, G Theocharous, B Metevier, P Thomas
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3381-3388, 2020
8 · 2020
SOPE: Spectrum of off-policy estimators
C Yuan, Y Chandak, S Giguere, PS Thomas, S Niekum
Advances in Neural Information Processing Systems 34, 18958-18969, 2021
6 · 2021
High-confidence off-policy (or counterfactual) variance estimation
Y Chandak, S Shankar, PS Thomas
Proceedings of the AAAI Conference on Artificial Intelligence 35 (8), 6939-6947, 2021
6 · 2021
Factored DRO: Factored distributionally robust policies for contextual bandits
T Mu, Y Chandak, TB Hashimoto, E Brunskill
Advances in Neural Information Processing Systems 35, 8318-8331, 2022
5 · 2022
Off-policy evaluation for action-dependent non-stationary environments
Y Chandak, S Shankar, N Bastian, B da Silva, E Brunskill, PS Thomas
Advances in Neural Information Processing Systems 35, 9217-9232, 2022
4 · 2022
HOPF: Higher order propagation framework for deep collective classification
P Vijayan, Y Chandak, MM Khapra, S Parthasarathy, B Ravindran
arXiv preprint arXiv:1805.12421, 2018
4 · 2018
High confidence generalization for reinforcement learning
J Kostas, Y Chandak, SM Jordan, G Theocharous, P Thomas
International Conference on Machine Learning, 5764-5773, 2021
3 · 2021
Generating and providing proposed digital actions in high-dimensional action spaces using reinforcement learning models
Y Chandak, G Theocharous
US Patent App. 16/261,092, 2020
3 · 2020
On optimizing human-machine task assignments
A Veit, M Wilber, R Vaish, S Belongie, J Davis, V Anand, A Aviral, ...
arXiv preprint arXiv:1509.07543, 2015
3 · 2015
On optimizing interventions in shared autonomy
W Tan, D Koleczek, S Pradhan, N Perello, V Chettiar, V Rohra, A Rajaram, ...
Proceedings of the AAAI Conference on Artificial Intelligence 36 (5), 5341-5349, 2022
2 · 2022