Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 740 | 2022 |
Holistic evaluation of language models P Liang, R Bommasani, T Lee, D Tsipras, D Soylu, M Yasunaga, Y Zhang, ... arXiv preprint arXiv:2211.09110, 2022 | 612 | 2022 |
Holistic evaluation of language models, 2022 P Liang, R Bommasani, T Lee, D Tsipras, D Soylu, M Yasunaga, Y Zhang, ... URL https://arxiv.org/abs/2211.09110, 2022 | 17 | 2022 |
Premise Order Matters in Reasoning with Large Language Models X Chen, RA Chi, X Wang, D Zhou https://arxiv.org/abs/2402.08939, 2024 | 2 | 2024 |
Stanford MLab at SemEval 2023 Task 7: Neural Methods for Clinical Trial Report NLI C Takehana, D Lim, E Kurtuluş, R Iyer, E Tanimura, P Aggarwal, ... Proceedings of the 17th International Workshop on Semantic Evaluation …, 2023 | 2 | 2023 |
Redwoodnlp at semeval-2021 task 7: Ensembled pretrained and lightweight models for humor detection N Chi, R Chi Proceedings of the 15th international workshop on semantic evaluation …, 2021 | 1 | 2021 |
Dialogue Distillery: Crafting Interpolable, Interpretable, and Introspectable Dialogue from LLMs RA Chi, J Kim, S Hickmann, S Li, G Chi, T Atchariyachanvanit, K Yu, ... Alexa Prize SocialBot Grand Challenge 5 | 1 | |
MODELING: A Novel Dataset for Testing Linguistic Reasoning in Language Models N Chi, T Malchev, R Kong, R Chi, L Huang, E Chi, R McCoy, D Radev Proceedings of the 6th Workshop on Research in Computational Linguistic …, 2024 | | 2024 |
GLARE: Generative Left-to-right AdversaRial Examples RA Chi, N Kim, P Liu, Z Lack, EA Chi Proceedings of the 3rd Workshop on Evaluation and Comparison of NLP Systems …, 2022 | | 2022 |
Stanford MLab at SemEval 2022 Task 7: Tree- and Transformer-Based Methods for Clarification Plausibility T Yim, J Lee, R Verma, S Hickmann, A Zhu, C Sallade, I Ng, R Chi, P Liu Proceedings of the 16th International Workshop on Semantic Evaluation …, 2022 | | 2022 |
Automated Topic-Tagging for Software-Related Question-and-Answer Sites A Agrawal, RA Chi, V Gupta | | |