Adam Gleave
CEO at FAR AI
Verified email at far.ai
Title
Cited by
Year
Stable-baselines3: Reliable reinforcement learning implementations
A Raffin, A Hill, A Gleave, A Kanervisto, M Ernestus, N Dormann
Journal of Machine Learning Research 22 (268), 1-8, 2021
Cited by: 1649 | Year: 2021
Stable baselines
A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ...
Cited by: 871 | Year: 2018
Adversarial policies: Attacking deep reinforcement learning
A Gleave, M Dennis, C Wild, N Kant, S Levine, S Russell
International Conference on Learning Representations, 2020
Cited by: 387 | Year: 2020
Firmament: Fast, centralized cluster scheduling at scale
I Gog, M Schwarzkopf, A Gleave, RNM Watson, S Hand
12th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2016
Cited by: 273 | Year: 2016
Inverse reinforcement learning for video games
A Tucker, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2018
Cited by: 53 | Year: 2018
Quantifying differences in reward functions
A Gleave, M Dennis, S Legg, S Russell, J Leike
International Conference on Learning Representations, 2021
Cited by: 52 | Year: 2021
imitation: Clean imitation learning implementations
A Gleave, M Taufeeque, J Rocamonde, E Jenner, SH Wang, S Toyer, ...
arXiv preprint arXiv:2211.11972, 2022
Cited by: 50* | Year: 2022
Multi-task maximum entropy inverse reinforcement learning
A Gleave, O Habryka
GoalsRL Workshop at ICML, 2018
Cited by: 41 | Year: 2018
Adversarial Policies Beat Superhuman Go AIs
TT Wang, A Gleave, T Tseng, N Belrose, J Miller, MD Dennis, Y Duan, ...
arXiv preprint arXiv:2211.00241, 2022
Cited by: 37* | Year: 2022
Active inverse reward design
S Mindermann, R Shah, A Gleave, D Hadfield-Menell
GoalsRL Workshop at ICML, 2018
Cited by: 28 | Year: 2018
Understanding learned reward functions
EJ Michaud, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by: 25 | Year: 2020
Invariance in policy optimisation and partial identifiability in reward learning
JMV Skalse, M Farrugia-Roberts, S Russell, A Abate, A Gleave
International Conference on Machine Learning, 32033-32058, 2023
Cited by: 24 | Year: 2023
Uncertainty estimation for language reward models
A Gleave, G Irving
arXiv preprint arXiv:2203.07472, 2022
Cited by: 21 | Year: 2022
A primer on maximum causal entropy inverse reinforcement learning
A Gleave, S Toyer
arXiv preprint arXiv:2203.11409, 2022
Cited by: 18 | Year: 2022
Making compression algorithms for Unicode text
A Gleave, C Steinruecken
Data Compression Conference, 2017
Cited by: 16 | Year: 2017
On the fragility of learned reward functions
L McKinney, Y Duan, D Krueger, A Gleave
arXiv preprint arXiv:2301.03652, 2023
Cited by: 10 | Year: 2023
Preprocessing reward functions for interpretability
E Jenner, A Gleave
arXiv preprint arXiv:2203.13553, 2022
Cited by: 8 | Year: 2022
DERAIL: Diagnostic Environments for Reward And Imitation Learning
P Freire, A Gleave, S Toyer, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by: 8 | Year: 2020
Exploiting novel GPT-4 APIs
K Pelrine, M Taufeeque, M Zając, E McLean, A Gleave
arXiv preprint arXiv:2312.14302, 2023
Cited by: 7 | Year: 2023
Reducing exploitability with population based training
P Czempin, A Gleave
arXiv preprint arXiv:2208.05083, 2022
Cited by: 5 | Year: 2022