Torsten Hoefler
Title
Cited by
Year
Demystifying parallel and distributed deep learning: An in-depth concurrency analysis
T Ben-Nun, T Hoefler
ACM Computing Surveys (CSUR) 52 (4), 1-43, 2019
Cited by: 755; Year: 2019
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
T Hoefler, D Alistarh, T Ben-Nun, N Dryden, A Peste
Journal of Machine Learning Research 22 (241), 1-124, 2021
Cited by: 546; Year: 2021
The convergence of sparsified gradient methods
D Alistarh, T Hoefler, M Johansson, N Konstantinov, S Khirirat, C Renggli
Advances in Neural Information Processing Systems 31, 2018
Cited by: 508; Year: 2018
MPI: A Message-Passing Interface Standard
MPI Forum
Technical Report, 2012
Cited by: 452*; Year: 2012
Slim Fly: A cost effective low-diameter network topology
M Besta, T Hoefler
SC'14: Proceedings of the international conference for high performance …, 2014
Cited by: 337; Year: 2014
Characterizing the influence of system noise on large-scale applications by simulation
T Hoefler, T Schneider, A Lumsdaine
SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
Cited by: 312; Year: 2010
Generic topology mapping strategies for large-scale parallel architectures
T Hoefler, M Snir
Proceedings of the international conference on Supercomputing, 75-84, 2011
Cited by: 296; Year: 2011
The PERCS high-performance interconnect
B Arimilli, R Arimilli, V Chung, S Clark, W Denzel, B Drerup, T Hoefler, ...
2010 18th IEEE Symposium on High Performance Interconnects, 75-82, 2010
Cited by: 295; Year: 2010
Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results
T Hoefler, R Belli
Proceedings of the international conference for high performance computing …, 2015
Cited by: 287; Year: 2015
Implementation and performance analysis of non-blocking collective operations for MPI
T Hoefler, A Lumsdaine, W Rehm
Proceedings of the 2007 ACM/IEEE conference on Supercomputing, 1-10, 2007
Cited by: 276; Year: 2007
Neural code comprehension: A learnable representation of code semantics
T Ben-Nun, AS Jakobovits, T Hoefler
Advances in Neural Information Processing Systems 31, 2018
Cited by: 267; Year: 2018
LogGOPSim: simulating large-scale applications in the LogGOPS model
T Hoefler, T Schneider, A Lumsdaine
Proceedings of the 19th ACM International Symposium on High Performance …, 2010
Cited by: 219; Year: 2010
GPTQ: Accurate post-training quantization for generative pre-trained transformers
E Frantar, S Ashkboos, T Hoefler, D Alistarh
arXiv preprint arXiv:2210.17323, 2022
Cited by: 215; Year: 2022
Augment your batch: Improving generalization through instance repetition
E Hoffer, T Ben-Nun, I Hubara, N Giladi, T Hoefler, D Soudry
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020
Cited by: 210; Year: 2020
DARE: High-performance state machine replication on RDMA networks
M Poke, T Hoefler
Proceedings of the 24th International Symposium on High-Performance Parallel …, 2015
Cited by: 180; Year: 2015
Using automated performance modeling to find scalability bugs in complex codes
A Calotoiu, T Hoefler, M Poke, F Wolf
Proceedings of the International Conference on High Performance Computing …, 2013
Cited by: 180; Year: 2013
Using advanced MPI: Modern features of the message-passing interface
W Gropp, T Hoefler, R Thakur, E Lusk
MIT Press, 2014
Cited by: 178; Year: 2014
Graph of thoughts: Solving elaborate problems with large language models
M Besta, N Blach, A Kubicek, R Gerstenberger, L Gianinazzi, J Gajda, ...
arXiv preprint arXiv:2308.09687, 2023
Cited by: 175; Year: 2023
To push or to pull: On reducing communication and synchronization in graph computations
M Besta, M Podstawski, L Groner, E Solomonik, T Hoefler
Proceedings of the 26th International Symposium on High-Performance Parallel …, 2017
Cited by: 167; Year: 2017
Enabling highly-scalable remote memory access programming with MPI-3 one sided
R Gerstenberger, M Besta, T Hoefler
Proceedings of the International Conference on High Performance Computing …, 2013
Cited by: 164; Year: 2013