BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1226 | 2023 |
The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only G Penedo, Q Malartic, D Hesslow, R Cojocaru, A Cappelli, H Alobeidli, ... arXiv preprint arXiv:2306.01116, 2023 | 410 | 2023 |
Falcon-40B: an open large language model with state-of-the-art performance E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ... arXiv, 2023 | 155 | 2023 |
What language model architecture and pretraining objective works best for zero-shot generalization? T Wang, A Roberts, D Hesslow, T Le Scao, HW Chung, I Beltagy, ... International Conference on Machine Learning, 22964-22984, 2022 | 100 | 2022 |
The Falcon series of open language models E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ... arXiv preprint arXiv:2311.16867, 2023 | 97 | 2023 |
What language model to train if you have one million GPU hours? TL Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ... arXiv preprint arXiv:2210.15424, 2022 | 81 | 2022 |
RITA: a study on scaling up generative protein sequence models D Hesslow, N Zanichelli, P Notin, I Poli, D Marks arXiv preprint arXiv:2205.05789, 2022 | 46 | 2022 |
The Falcon series of language models: Towards open frontier models E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, ... Hugging Face repository, 2023 | 24 | 2023 |
BLOOM: A 176B-parameter open-access multilingual language model. CoRR, abs/2211.05100, 2022. doi: 10.48550 T Le Scao, A Fan, C Akiki, E Pavlick, S Ilic, D Hesslow, R Castagné, ... arXiv preprint arXiv:2211.05100 | 20 | |
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only. arXiv 2023 G Penedo, Q Malartic, D Hesslow, R Cojocaru, A Cappelli, H Alobeidli, ... arXiv preprint arXiv:2306.01116 | 18 | |
The RefinedWeb dataset for Falcon LLM: Outperforming curated corpora with web data only G Penedo, Q Malartic, D Hesslow, R Cojocaru, H Alobeidli, A Cappelli, ... Advances in Neural Information Processing Systems 36, 2024 | 13 | 2024 |
Falcon-40B: an open large language model with state-of-the-art performance. 2023 E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ... URL https://falconllm.tii.ae, 2022 | 10 | 2022 |
LightOn optical processing unit: Scaling-up AI and HPC with a non von Neumann co-processor C Brossollet, A Cappelli, I Carron, C Chaintoutis, A Chatelain, L Daudet, ... arXiv preprint arXiv:2107.11814, 2021 | 9 | 2021 |
Contrastive embeddings for neural architectures D Hesslow, I Poli arXiv preprint arXiv:2102.04208, 2021 | 6 | 2021 |
Falcon-40B: an open large language model with state-of-the-art performance H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, E Goffinet, ... | 5 | 2023 |
Is the number of trainable parameters all that actually matters? A Chatelain, A Djeghri, D Hesslow, J Launay I (Still) Can't Believe It's Not Better! Workshop at NeurIPS 2021, 27-32, 2022 | 4 | 2022 |
Building a Swedish question-answering model H von Essen, D Hesslow Proceedings of the Probability and Meaning Conference (PaM 2020), 117-127, 2020 | 3 | 2020 |
Linear optical random projections without holography R Ohana, D Hesslow, D Brunner, S Gigan, K Müller Optics Express 31 (16), 25881-25888, 2023 | 2 | 2023 |
Method and system for machine learning using optical data I Poli, J Launay, K Müller, G Pariente, I Carron, L Daudet, R Ohana, ... US Patent 11,574,178, 2023 | 2 | 2023 |
Artificial Neural Network Training on an Optical Processor via Direct Feedback Alignment K Müller, J Launay, I Poli, M Filipovich, A Capelli, D Hesslow, I Carron, ... The European Conference on Lasers and Electro-Optics, jsiii_3_3, 2023 | 1 | 2023 |