Mantas Mazeika

Cited by

	All	Since 2019
Citations	6767	6741
h-index	15	15
i10-index	17	17

Citations per year
Year	2019	2020	2021	2022	2023	2024
Citations	131	449	906	1258	2527	1451

Public access
2 articles available, 0 articles not available (based on funding mandates)

Co-authors

Dan Hendrycks, Director of the Center for AI Safety (verified email at berkeley.edu)
Dawn Song, Professor of Computer Science, UC Berkeley (verified email at cs.berkeley.edu)
Andy Zou, PhD Student, Carnegie Mellon University (verified email at andrew.cmu.edu)
Bo Li, University of Illinois at Urbana–Champaign (verified email at illinois.edu)
David Forsyth, Professor of Computer Science, University of Illinois, Urbana Champaign (verified email at uiuc.edu)

Mantas Mazeika

University of Illinois Urbana-Champaign

Verified email at illinois.edu

ML Safety, AI Safety, Machine Ethics, ML Reliability


Title	Authors	Venue	Cited by	Year
Deep anomaly detection with outlier exposure	D Hendrycks, M Mazeika, T Dietterich	arXiv preprint arXiv:1812.04606, 2018	1468	2018
Measuring massive multitask language understanding	D Hendrycks, C Burns, S Basart, A Zou, M Mazeika, D Song, J Steinhardt	arXiv preprint arXiv:2009.03300, 2020	1112	2020
Using self-supervised learning can improve model robustness and uncertainty	D Hendrycks, M Mazeika, S Kadavath, D Song	Advances in Neural Information Processing Systems 32, 2019	959	2019
Using pre-training can improve model robustness and uncertainty	D Hendrycks, K Lee, M Mazeika	International Conference on Machine Learning, 2712-2721, 2019	751	2019
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models	A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...	arXiv preprint arXiv:2206.04615, 2022	734	2022
Using trusted data to train deep networks on labels corrupted by severe noise	D Hendrycks, M Mazeika, D Wilson, K Gimpel	Advances in Neural Information Processing Systems 31, 2018	595	2018
Scaling out-of-distribution detection for real-world settings	D Hendrycks, S Basart, M Mazeika, M Mostajabi, J Steinhardt, D Song	arXiv preprint arXiv:1911.11132, 2019	306	2019
Measuring coding challenge competence with APPS	D Hendrycks, S Basart, S Kadavath, M Mazeika, A Arora, E Guo, C Burns, ...	arXiv preprint arXiv:2105.09938, 2021	291	2021
DecodingTrust: A comprehensive assessment of trustworthiness in GPT models	B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu, Z Xiong, R Dutta, ...	arXiv preprint arXiv:2306.11698, 2023	112	2023
PixMix: Dreamlike pictures comprehensively improve safety measures	D Hendrycks, A Zou, M Mazeika, L Tang, B Li, D Song, J Steinhardt	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	87	2022
A benchmark for anomaly segmentation	D Hendrycks, S Basart, M Mazeika, M Mostajabi, J Steinhardt, D Song		66	2019
An overview of catastrophic AI risks	D Hendrycks, M Mazeika, T Woodside	arXiv preprint arXiv:2306.12001, 2023	65	2023
Representation engineering: A top-down approach to AI transparency	A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...	arXiv preprint arXiv:2310.01405, 2023	64	2023
X-risk analysis for AI research	D Hendrycks, M Mazeika	arXiv preprint arXiv:2206.05862, 2022	53	2022
What would Jiminy Cricket do? Towards agents that behave morally	D Hendrycks, M Mazeika, A Zou, S Patel, C Zhu, J Navarro, D Song, B Li, ...	arXiv preprint arXiv:2110.13136, 2021	44	2021
Forecasting Future World Events with Neural Networks	A Zou, T Xiao, R Jia, J Kwon, M Mazeika, R Li, D Song, J Steinhardt, ...	arXiv preprint arXiv:2206.15474, 2022	13	2022
How to steer your adversary: Targeted and efficient model stealing defenses with gradient redirection	M Mazeika, B Li, D Forsyth	International Conference on Machine Learning, 15241-15254, 2022	12	2022
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios	M Mazeika, E Tang, A Zou, S Basart, JS Chan, D Song, D Forsyth, ...	arXiv preprint arXiv:2210.10039, 2022	8	2022
The trojan detection challenge	M Mazeika, D Hendrycks, H Li, X Xu, S Hough, A Zou, A Rajabi, Q Yao, ...	NeurIPS 2022 Competition Track, 279-291, 2022	5	2022
The singular value decomposition and low rank approximation	M Mazeika	Technical Report, 2016	5	2016
