Well-Read Students Learn Better: On the Importance of Pre-training Compact Models. I Turc, MW Chang, K Lee, K Toutanova. arXiv preprint arXiv:1908.08962, 2019. Cited by 612.
Well-Read Students Learn Better: The Impact of Student Initialization on Knowledge Distillation. I Turc, MW Chang, K Lee, K Toutanova. arXiv preprint arXiv:1908.08962, 2019. Cited by 211.
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation. JH Clark, D Garrette, I Turc, J Wieting. Transactions of the Association for Computational Linguistics 10, 73-91, 2022. Cited by 170.
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding. K Lee, M Joshi, IR Turc, H Hu, F Liu, JM Eisenschlos, U Khandelwal, ... International Conference on Machine Learning, 18893-18912, 2023. Cited by 120.
Measuring Attribution in Natural Language Generation Models. H Rashkin, V Nikolaev, M Lamm, L Aroyo, M Collins, D Das, S Petrov, ... Computational Linguistics 49 (4), 777-840, 2023. Cited by 100.
The MultiBERTs: BERT Reproductions for Robustness Analysis. T Sellam, S Yadlowsky, J Wei, N Saphra, A D'Amour, T Linzen, J Bastings, ... arXiv preprint arXiv:2106.16163, 2021. Cited by 77.
Revisiting the Primacy of English in Zero-Shot Cross-Lingual Transfer. I Turc, K Lee, J Eisenstein, MW Chang, K Toutanova. arXiv preprint arXiv:2106.16171, 2021. Cited by 45.
High Performance Natural Language Processing. G Ilharco, C Ilharco, I Turc, T Dettmers, F Ferreira, K Lee. Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020. Cited by 3.
Learning Task Sampling Policy for Multitask Learning. D Sundararaman, H Tsai, KH Lee, I Turc, L Carin. Findings of the Association for Computational Linguistics: EMNLP 2021, 4410-4415, 2021. Cited by 2.
Recurrent Neural Networks for Statistical Machine Translation. IR Turc. University of Oxford, 2014.