Tony Wang

Cited by

	All	Since 2019
Citations	327	327
h-index	4	4
i10-index	3	3

Citations per year
Year	2021	2022	2023	2024
Citations	1	5	106	213

Public access (based on funding mandates)

1 article available
0 articles not available

Co-authors

Adam Gleave, CEO at FAR AI (verified email at far.ai)
Tom Tseng, FAR AI (verified email at far.ai)
Michael Dennis, Google DeepMind (verified email at cs.berkeley.edu)
Sergey Levine, UC Berkeley, Physical Intelligence (verified email at eecs.berkeley.edu)
Stuart Russell, Professor of Computer Science, University of California, Berkeley (verified email at cs.berkeley.edu)
Yawen Duan, University of Cambridge (verified email at cam.ac.uk)
Yuheng Bu, University of Florida (verified email at ufl.edu)
Gregory Wornell, Professor, Electrical Engineering and Computer Science, MIT (verified email at mit.edu)
Nir Shavit

Tony Wang

PhD student, MIT

Verified email at mit.edu

Interests: artificial intelligence, AI safety


Title	Cited by	Year
Open problems and fundamental limitations of reinforcement learning from human feedback. S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ... arXiv preprint arXiv:2307.15217, 2023	259	2023
Adversarial Policies Beat Superhuman Go AIs. TT Wang, A Gleave, N Belrose, T Tseng, J Miller, MD Dennis, Y Duan, ... arXiv preprint arXiv:2211.00241, 2022	45*	2022
Neural-guided, bidirectional program search for abstraction and reasoning. S Alford, A Gandhi, A Rangamani, A Banburski, T Wang, S Dandekar, ... Complex Networks & Their Applications X: Volume 1, Proceedings of the Tenth …, 2022	15	2022
SDP Methods for Sensitivity-Constrained Privacy Funnel and Information Bottleneck Problems. Y Bu, T Wang, GW Wornell. 2021 IEEE International Symposium on Information Theory (ISIT), 49-54, 2021	6	2021
Forbidden Facts: An Investigation of Competing Objectives in Llama-2. TT Wang, M Wang, K Hariharan, N Shavit. arXiv preprint arXiv:2312.08793, 2023	2	2023
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation. D Halawi, A Wei, E Wallace, TT Wang, N Haghtalab, J Steinhardt. arXiv preprint arXiv:2406.20053, 2024		2024
Can Go AIs be adversarially robust? T Tseng, E McLean, K Pelrine, TT Wang, A Gleave. arXiv preprint arXiv:2406.12843, 2024		2024
A connectomics-driven analysis reveals novel characterization of border regions in mouse visual cortex. N Tumma, L Kong, S Sawmya, TT Wang, N Shavit. bioRxiv, 2024.05.24.595837, 2024		2024
Cliff-Learning. TT Wang, I Zablotchi, N Shavit, JS Rosenfeld. arXiv preprint arXiv:2302.07348, 2023		2023
Adversarial Examples in Simpler Settings. TT Wang. Massachusetts Institute of Technology, 2021		2021
