A unifying review of deep and shallow anomaly detection

L Ruff, JR Kauffmann, RA Vandermeulen… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Deep learning approaches to anomaly detection (AD) have recently improved the state of
the art in detection performance on complex data sets, such as large collections of images or …

Shortcut learning in deep neural networks

R Geirhos, JH Jacobsen, C Michaelis… - Nature Machine …, 2020 - nature.com
Deep learning has triggered the current rise of artificial intelligence and is the workhorse of
today's machine intelligence. Numerous success stories have rapidly spread all over …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arXiv preprint arXiv …, 2022 - arxiv.org
Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

Generalized out-of-distribution detection: A survey

J Yang, K Zhou, Y Li, Z Liu - International Journal of Computer Vision, 2024 - Springer
Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of
machine learning systems. For instance, in autonomous driving, we would like the driving …

RLAIF: Scaling reinforcement learning from human feedback with AI feedback

H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard… - 2023 - openreview.net
Reinforcement learning from human feedback (RLHF) is an effective technique for aligning
large language models (LLMs) to human preferences, but gathering high-quality human …

Red teaming language models with language models

E Perez, S Huang, F Song, T Cai, R Ring… - arXiv preprint arXiv …, 2022 - arxiv.org
Language Models (LMs) often cannot be deployed because of their potential to harm users
in hard-to-predict ways. Prior work identifies harmful behaviors before deployment by using …

“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

N Sambasivan, S Kapania, H Highfill… - Proceedings of the …, 2021 - dl.acm.org
AI models are increasingly applied in high-stakes domains like health and conservation.
Data quality carries an elevated significance in high-stakes AI due to its heightened …

Connecting the dots in trustworthy Artificial Intelligence: From AI principles, ethics, and key requirements to responsible AI systems and regulation

N Díaz-Rodríguez, J Del Ser, M Coeckelbergh… - Information …, 2023 - Elsevier
Trustworthy Artificial Intelligence (AI) is based on seven technical requirements
sustained over three main pillars that should be met throughout the system's entire life cycle …