An inference-based policy gradient method for learning options | M Smith, H Hoof, J Pineau | International Conference on Machine Learning, 4703-4712, 2018 | 39 | 2018 |
A Strong Baseline for Batch Imitation Learning | M Smith, L Maystre, Z Dai, K Ciosek | arXiv preprint arXiv:2302.02788, 2023 | 5 | 2023 |
Why Target Networks Stabilise Temporal Difference Methods | M Fellows, MJA Smith, S Whiteson | arXiv preprint arXiv:2302.12537, 2023 | 3 | 2023 |
Learning Skills Diverse in Value-Relevant Features | MJA Smith, J Luketina, K Hartikainen, M Igl, S Whiteson | Conference on Lifelong Learning Agents, 1174-1194, 2022 | 1 | 2022 |
A Sparse Probabilistic Model of User Preference Data | M Smith, L Charlin, J Pineau | Advances in Artificial Intelligence: 30th Canadian Conference on Artificial …, 2017 | | 2017 |