Daniel Paleka

Cited by

	All	Since 2019
Citations	329	329
h-index	7	7
i10-index	6	6

Citations per year (2022, 2023, 2024): 2, 111, 216

Co-authors

Florian Tramèr; Assistant Professor of Computer Science, ETH Zurich; verified email at inf.ethz.ch
Nicholas Carlini; Google DeepMind; verified email at google.com
Javier Rando; ETH Zurich; verified email at ai.ethz.ch
Lennart Heim; Centre for the Governance of AI; verified email at governance.ai
David Lindner; Google DeepMind; verified email at google.com
Lukas Fluri; Master graduate, ETH Zürich; verified email at ethz.ch
Amartya Sanyal; Max Planck Institute for Intelligent Systems, Tuebingen; verified email at tuebingen.mpg.de

Daniel Paleka

ETH Zurich

Verified email at inf.ethz.ch

Machine Learning, ML Security, AI Safety


Title	Authors	Publication	Cited by	Year
Poisoning Web-Scale Training Datasets is Practical	N Carlini, M Jagielski, CA Choquette-Choo, D Paleka, W Pearce, ...	arXiv preprint arXiv:2302.10149, 2023	115	2023
Red-Teaming the Stable Diffusion Safety Filter	J Rando, D Paleka, D Lindner, L Heim, F Tramèr	arXiv preprint arXiv:2210.04610, 2022	93	2022
ARB: Advanced Reasoning Benchmark for Large Language Models	T Sawada, D Paleka, A Havrilla, P Tadepalli, P Vidas, A Kranias, JJ Nay, ...	arXiv preprint arXiv:2307.13692, 2023	37	2023
Foundational Challenges in Assuring Alignment and Safety of Large Language Models	U Anwar, A Saparov, J Rando, D Paleka, M Turpin, P Hase, ES Lubana, ...	arXiv preprint arXiv:2404.09932, 2024	34	2024
Stealing Part of a Production Language Model	N Carlini, D Paleka, KD Dvijotham, T Steinke, J Hayase, AF Cooper, ...	arXiv preprint arXiv:2403.06634, 2024	21	2024
Evaluating Superhuman Models with Consistency Checks	L Fluri, D Paleka, F Tramèr	arXiv preprint arXiv:2306.09983, 2023	16	2023
A law of adversarial risk, interpolation, and label noise	D Paleka, A Sanyal	arXiv preprint arXiv:2207.03933, 2022	8	2022
Refusal in Language Models Is Mediated by a Single Direction	A Arditi, O Obeso, A Syed, D Paleka, N Rimsky, W Gurnee, N Nanda	arXiv preprint arXiv:2406.11717, 2024	4	2024
Injectivity of ReLU neural networks at initialization	D Paleka	ETH Zurich, 2021	1	2021
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition	E Debenedetti, J Rando, D Paleka*, FF Silaghi, D Albastroiu, N Cohen, ...	arXiv e-prints, arXiv:2406.07954, 2024		2024
