Silviu Pitis
University of Toronto, Vector Institute
Verified email at cs.toronto.edu
Title · Cited by · Year
Large language models are human-level prompt engineers
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
International Conference on Learning Representations (ICLR 2023), 2023
Cited by 425 · 2023
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning
S Pitis, H Chan, S Zhao, B Stadie, J Ba
International Conference on Machine Learning (ICML 2020), 2020
Cited by 111 · 2020
Counterfactual data augmentation using locally factored dynamics
S Pitis, E Creager, A Garg
Neural Information Processing Systems (NeurIPS 2020), 2020
Cited by 70 · 2020
Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach
S Pitis
The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019
Cited by 47 · 2019
Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning
K De Asis, A Chan, S Pitis, RS Sutton, D Graves
The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020
Cited by 30 · 2020
Boosted prompt ensembles for large language models
S Pitis, MR Zhang, A Wang, J Ba
arXiv preprint arXiv:2304.05970, 2023
Cited by 24 · 2023
An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
S Pitis, H Chan, K Jamali, J Ba
Eighth International Conference on Learning Representations (ICLR 2020), 2020
Cited by 21 · 2020
Source Traces for Temporal Difference Learning
S Pitis
The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018
Cited by 19 · 2018
MoCoDA: Model-based Counterfactual Data Augmentation
S Pitis, E Creager, A Mandlekar, A Garg
Neural Information Processing Systems (NeurIPS 2022), 2022
Cited by 18 · 2022
Large language models are human-level prompt engineers (2022)
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
arXiv preprint arXiv:2211.01910, 2022
Cited by 16 · 2022
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
Y Ruan, H Dong, A Wang, S Pitis, Y Zhou, J Ba, Y Dubois, CJ Maddison, ...
arXiv preprint arXiv:2309.15817, 2023
Cited by 15 · 2023
Failure Modes of Learning Reward Models for LLMs and Other Sequence Models
S Pitis
ICML 2023 Workshop The Many Facets of Preference-Based Learning, 2023
Cited by 6 · 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
S Pitis
Neural Information Processing Systems (NeurIPS 2023), 2023
Cited by 5* · 2023
Calibrating language models via augmented prompt ensembles
M Jiang, Y Ruan, S Huang, S Liao, S Pitis, RB Grosse, J Ba
Cited by 4 · 2023
Return augmentation gives supervised RL temporal compositionality
K Paster, S Pitis, SA McIlraith, J Ba
Deep Reinforcement Learning Workshop NeurIPS 2022, 2022
Cited by 4 · 2022
Steering large language models using APE
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
NeurIPS ML Safety Workshop, 2022
Cited by 3 · 2022
Objective Social Choice: Using Auxiliary Information to Improve Voting Outcomes
S Pitis, MR Zhang
International Conference on Autonomous Agents and Multi-Agent Systems 2020, 2020
Cited by 3 · 2020
ProtoGE: Prototype Goal Encodings for Multi-goal Reinforcement Learning
S Pitis, H Chan, J Ba
The 4th Multidisciplinary Conference on Reinforcement Learning and Decision …, 2019
Cited by 3 · 2019
Methods for retrieving alternative contract language using a prototype
S Pitis
The Sixteenth International Conference on Law and Artificial Intelligence …, 2017
Cited by 3 · 2017
CSC 311: Introduction to machine learning
R Grosse, C Maddison, J Bae, S Pitis
University of Toronto, Fall 2020
Cited by 2 · 2020
Articles 1–20