Nathaniel Li - Google Scholar

Cited by

	All	Since 2019
Citations	286	286
h-index	4	4
i10-index	4	4

Citations per year: 2022: 1; 2023: 54; 2024: 228

Public access

1 article available, 0 articles not available (based on funding mandates)

Co-authors

Andy Zou - PhD Student, Carnegie Mellon University - Verified email at andrew.cmu.edu
Dan Hendrycks - Director of the Center for AI Safety - Verified email at berkeley.edu
Steven Basart - PhD, University of Chicago - Verified email at ttic.edu
Alexander Pan - UC Berkeley - Verified email at berkeley.edu
Mantas Mazeika - University of Illinois Urbana-Champaign - Verified email at illinois.edu
Zifan Wang - Carnegie Mellon University - Verified email at andrew.cmu.edu

Nathaniel Li

Verified email at berkeley.edu

Machine Learning


Title	Cited by	Year
Representation Engineering: A Top-Down Approach to AI Transparency. A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ... arXiv preprint arXiv:2310.01405, 2023	133	2023
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark. A Pan, JS Chan, A Zou, N Li, S Basart, T Woodside, H Zhang, S Emmons, ... ICML 2023, 26837-26867, 2023	90	2023
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal. M Mazeika, L Phan, X Yin, A Zou, Z Wang, N Mu, E Sakhaee, N Li, ... ICML 2024, 2024	39	2024
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning. N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, ... ICML 2024, 2024	24	2024
