Andy L. Jones - Google Scholar

Cited by

	All	Since 2019
Citations	3409	3405
h-index	11	11
i10-index	11	11

Citations per year: 2022: 99, 2023: 1403, 2024: 1889

Andy L. Jones

Anthropic

Verified email at andyljones.com

Machine learning


Title	Cited by	Year
Training a helpful and harmless assistant with reinforcement learning from human feedback Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, ... arXiv preprint arXiv:2204.05862, 2022	985	2022
Constitutional AI: Harmlessness from AI feedback Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ... arXiv preprint arXiv:2212.08073, 2022	778	2022
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ... arXiv preprint arXiv:2209.07858, 2022	300	2022
A general language assistant as a laboratory for alignment A Askell, Y Bai, A Chen, D Drain, D Ganguli, T Henighan, A Jones, ... arXiv preprint arXiv:2112.00861, 2021	283	2021
In-context learning and induction heads C Olsson, N Elhage, N Nanda, N Joseph, N DasSarma, T Henighan, ... arXiv preprint arXiv:2209.11895, 2022	255*	2022
A mathematical framework for transformer circuits N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ... Transformer Circuits Thread 1 (1), 12, 2021	239*	2021
Predictability and surprise in large generative models D Ganguli, D Hernandez, L Lovitt, A Askell, Y Bai, A Chen, T Conerly, ... Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022	221	2022
Discovering language model behaviors with model-written evaluations E Perez, S Ringer, K Lukošiūtė, K Nguyen, E Chen, S Heiner, C Pettit, ... arXiv preprint arXiv:2212.09251, 2022	162	2022
Language models (mostly) know what they know S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ... arXiv preprint arXiv:2207.05221, 2022	106	2022
Measuring progress on scalable oversight for large language models SR Bowman, J Hyun, E Perez, E Chen, C Pettit, S Heiner, K Lukošiūtė, ... arXiv preprint arXiv:2211.03540, 2022	58	2022
Scaling scaling laws with board games AL Jones arXiv preprint arXiv:2104.03113, 2021	21	2021
Segmenting microarrays with deep neural networks A Jones bioRxiv, 020404, 2015	1	2015
Nova MUSCAE 1998 A Jones, A Pearce, F Farrell International Astronomical Union Circular 7080, 2, 1999		1999
V3890 Sagittarii A Jones, A Pearce International Astronomical Union Circular 5004, 3, 1990		1990

Articles 1–14