| Publication | Cited by | Year |
| --- | --- | --- |
| Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ... arXiv preprint arXiv:2402.00159, 2024 | 28* | 2024 |
| OLMo: Accelerating the Science of Language Models. D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord, AH Jha, ... arXiv preprint arXiv:2402.00838, 2024 | 25* | 2024 |
| What's In My Big Data? Y Elazar, A Bhagia, I Magnusson, A Ravichander, D Schwenk, A Suhr, ... arXiv preprint arXiv:2310.20707, 2023 | 18 | 2023 |
| HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation. H Ivison, A Bhagia, Y Wang, H Hajishirzi, M Peters. arXiv preprint arXiv:2212.10315, 2022 | 14 | 2022 |
| Findings of the WMT’22 Shared Task on Large-Scale Machine Translation Evaluation for African Languages. D Adelani, MMI Alam, A Anastasopoulos, A Bhagia, MR Costa-jussà, ... Proceedings of the Seventh Conference on Machine Translation (WMT), 773-800, 2022 | 13 | 2022 |
| Catwalk: A Unified Language Model Evaluation Framework for Many Datasets. D Groeneveld, A Awadalla, I Beltagy, A Bhagia, I Magnusson, H Peng, ... arXiv preprint arXiv:2312.10253, 2023 | 3 | 2023 |
| Continued Pretraining for Better Zero- and Few-Shot Promptability. Z Wu, RL Logan IV, P Walsh, A Bhagia, D Groeneveld, S Singh, I Beltagy. arXiv preprint arXiv:2210.10258, 2022 | 3 | 2022 |
| On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization. S Palaskar, A Bhagia, Y Bisk, F Metze, AW Black, A Marasović. arXiv preprint arXiv:2205.11686, 2022 | 2 | 2022 |
| Paloma: A Benchmark for Evaluating Language Model Fit. I Magnusson, A Bhagia, V Hofmann, L Soldaini, AH Jha, O Tafjord, ... arXiv preprint arXiv:2312.10523, 2023 | | 2023 |
| Robust Tooling and New Resources for Large Language Model Evaluation via Catwalk. K Richardson, I Magnusson, O Tafjord, A Bhagia, I Beltagy, A Cohan, ... | | |