Kaivalya Hariharan
MIT CSAIL
Verified email at mit.edu
Title
Cited by
Year
Red teaming deep neural networks with feature synthesis tools
S Casper, T Bu, Y Li, J Li, K Zhang, K Hariharan, D Hadfield-Menell
Advances in Neural Information Processing Systems 36, 80470-80516, 2023
Cited by 10
2023
Diagnostics for deep neural networks with automated copy/paste attacks
S Casper, K Hariharan, D Hadfield-Menell
arXiv preprint arXiv:2211.10024, 2022
Cited by 9
2022
Forbidden Facts: An Investigation of Competing Objectives in Llama-2
TT Wang, M Wang, K Hariharan, N Shavit
arXiv preprint arXiv:2312.08793, 2023
2023