Alexander Turner

Citations per year: 2018: 29, 2019: 163, 2020: 326, 2021: 462, 2022: 530, 2023: 656, 2024: 241

Public access (based on funding mandates): 1 article available, 0 articles not available

Alexander Turner

Unknown affiliation

Verified email at mit.edu


Title	Cited by	Year
Robustness may be at odds with accuracy. D Tsipras, S Santurkar, L Engstrom, A Turner, A Madry. arXiv preprint arXiv:1805.12152, 2018	1761	2018
Label-consistent backdoor attacks. A Turner, D Tsipras, A Madry. arXiv preprint arXiv:1912.02771, 2019	464*	2019
There is no free lunch in adversarial robustness (but there are unexpected benefits). D Tsipras, S Santurkar, L Engstrom, A Turner, A Madry. arXiv preprint arXiv:1805.12152 2 (3), 2018	91	2018
Optimal policies tend to seek power. AM Turner, L Smith, R Shah, A Critch, P Tadepalli. arXiv preprint arXiv:1912.01683, 2019	51	2019
Robustness may be at odds with accuracy. arXiv. D Tsipras, S Santurkar, L Engstrom, A Turner, A Madry. arXiv preprint arXiv:1805.12152 10, 2018	23	2018
Parametrically retargetable decision-makers tend to seek power. A Turner, P Tadepalli. Advances in Neural Information Processing Systems 35, 31391-31401, 2022	11	2022
Steering llama 2 via contrastive activation addition. N Rimsky, N Gabrieli, J Schulz, M Tong, E Hubinger, AM Turner. arXiv preprint arXiv:2312.06681, 2023	8	2023
On avoiding power-seeking by artificial intelligence. AM Turner. arXiv preprint arXiv:2206.11831, 2022	2	2022
Understanding and Controlling a Maze-Solving Policy Network. U Mini, P Grietzer, M Sharma, A Meek, M MacDiarmid, AM Turner. arXiv preprint arXiv:2310.08043, 2023	1	2023
Formalizing the problem of side effect regularization. AM Turner, A Saxena, P Tadepalli. arXiv preprint arXiv:2206.11812, 2022	1	2022
