Wout Schellaert
Title
Cited by
Year
Rethink reporting of evaluation results in AI
R Burnell, W Schellaert, J Burden, TD Ullman, F Martínez-Plumed, ...
Science 380 (6641), 136-138, 2023
Cited by 43*, 2023
Training on the Test Set: Mapping the System-Problem Space in AI
J Hernández-Orallo, W Schellaert, F Martínez-Plumed
Proceedings of the AAAI Conference on Artificial Intelligence 36 (11), 12256 …, 2022
Cited by 5, 2022
Reject Before You Run: Small Assessors Anticipate Big Language Models
L Zhou, F Martínez-Plumed, J Hernández-Orallo, C Ferri, W Schellaert
Evaluation Beyond Metrics Workshop @ IJCAI, 2022
Cited by 5, 2022
Your Prompt is My Command: On Assessing the Human-Centred Generality of Multimodal Models
W Schellaert, F Martínez-Plumed, K Vold, J Burden, PAM Casares, ...
Journal of Artificial Intelligence Research 77, 377-394, 2023
Cited by 3*, 2023
Animal-AI 3: What's New & Why You Should Care
K Voudouris, I Alhas, W Schellaert, M Crosby, J Holmes, J Burden, ...
arXiv preprint arXiv:2312.11414, 2023
Cited by 1, 2023
Predictable Artificial Intelligence
L Zhou, PA Moreno-Casares, F Martínez-Plumed, J Burden, R Burnell, ...
arXiv preprint arXiv:2310.06167, 2023
Cited by 1, 2023
A Proposal for Scaling the Scaling Laws
W Schellaert, R Hamon, F Martínez-Plumed, J Hernández-Orallo
Proceedings of the First edition of the Workshop on the Scaling Behavior of …, 2024
2024
Assessing AI capabilities with education tests
M Staneva, A Baret, Á Aso-Mollar, J Blass, SC Ponz, V Conitzer, U Cortes, ...
OECD, 2023
2023