Solving quantitative reasoning problems with language models A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ... Advances in Neural Information Processing Systems 35, 3843-3857, 2022 | 385 | 2022 |
Linear Transformers are Secretly Fast Weight Programmers I Schlag*, K Irie*, J Schmidhuber International Conference on Machine Learning, 9355-9366, 2021 | 174* | 2021 |
Block-Recurrent Transformers DL Hutchins*, I Schlag*, Y Wu, E Dyer, B Neyshabur arXiv preprint arXiv:2203.07852, 2022 | 80 | 2022 |
Learning to reason with third order tensor products I Schlag, J Schmidhuber Advances in Neural Information Processing Systems 31, 9981-9993, 2018 | 78 | 2018 |
Enhancing the transformer with explicit relational encoding for math problem solving I Schlag, P Smolensky, R Fernandez, N Jojic, J Schmidhuber, J Gao arXiv preprint arXiv:1910.06611, 2019 | 65 | 2019 |
Going beyond linear transformers with recurrent fast weight programmers K Irie*, I Schlag*, R Csordás, J Schmidhuber Advances in Neural Information Processing Systems 34, 2021 | 55 | 2021 |
Learning Associative Inference Using Fast Weight Memory I Schlag, T Munkhdalai, J Schmidhuber International Conference on Learning Representations, 2021 | 39 | 2021 |
Ancient Roman coin recognition in the wild using deep learning based recognition of artistically depicted face profiles I Schlag, O Arandjelovic Proceedings of the IEEE International Conference on Computer Vision …, 2017 | 37 | 2017 |
Gated fast weights for on-the-fly neural program generation I Schlag, J Schmidhuber NIPS Metalearning Workshop, 2017 | 31 | 2017 |
A Modern Self-Referential Weight Matrix That Learns to Modify Itself K Irie, I Schlag, R Csordás, J Schmidhuber Deep RL Workshop NeurIPS 2021, 2021 | 28 | 2021 |
Solving quantitative reasoning problems with language models, 2022 A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ... URL https://arxiv.org/abs/2206.14858 | 28* | |
Mindstorms in Natural Language-Based Societies of Mind M Zhuge, H Liu, F Faccio, DR Ashley, R Csordás, A Gopalakrishnan, ... arXiv preprint arXiv:2305.17066, 2023 | 25 | 2023 |
Large Language Model Programs I Schlag, S Sukhbaatar, A Celikyilmaz, W Yih, J Weston, J Schmidhuber, ... arXiv preprint arXiv:2305.05364, 2023 | 12 | 2023 |
Improving Baselines in the Wild K Irie, I Schlag, R Csordás, J Schmidhuber NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and …, 2021 | 2 | 2021 |
Augmenting Classic Algorithms with Neural Components for Strong Generalisation on Ambiguous and High-Dimensional Data I Schlag, J Schmidhuber Advances in Programming Languages and Neurosymbolic Systems Workshop, 2021 | 1 | 2021 |
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute A Stanić, D Ashley, O Serikov, L Kirsch, F Faccio, J Schmidhuber, ... arXiv preprint arXiv:2309.11197, 2023 | | 2023 |
Fast weight programmers for greater systematic generalisation in language I Schlag | | 2023 |
IDSIA/modern-srwm: Official repository for the paper "A Modern Self-Referential Weight Matrix That Learns to Modify Itself" (ICML 2022 & NeurIPS 2021 Deep RL Workshop) K Irie, I Schlag, R Csordás, J Schmidhuber GitHub, 2021 | | 2021 |
Gated Fast Weights for Associative Retrieval I Schlag, J Schmidhuber | | 2017 |