Vijay Anand Korthikanti
Principal Research Scientist, Nvidia
Verified email at uiuc.edu
Title
Cited by
Year
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990, 2022
Cited by: 615; Year: 2022
Efficient large-scale language model training on GPU clusters using Megatron-LM
D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 596; Year: 2021
Reducing activation recomputation in large transformer models
VA Korthikanti, J Casper, S Lym, L McAfee, M Andersch, M Shoeybi, ...
Proceedings of Machine Learning and Systems 5, 341-353, 2023
Cited by: 178; Year: 2023
Synthesizing geometry constructions
S Gulwani, VA Korthikanti, A Tiwari
ACM SIGPLAN Notices 46 (6), 50-61, 2011
Cited by: 175; Year: 2011
Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, and Bryan Catanzaro
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large …, 2022
Cited by: 137; Year: 2022
Towards optimizing energy costs of algorithms for shared memory architectures
VA Korthikanti, G Agha
Proceedings of the twenty-second annual ACM symposium on Parallelism in …, 2010
Cited by: 74; Year: 2010
Reasoning about MDPs as transformers of probability distributions
VA Korthikanti, M Viswanathan, G Agha, YM Kwon
2010 Seventh International Conference on the Quantitative Evaluation of …, 2010
Cited by: 56; Year: 2010
Analysis of parallel algorithms for energy conservation in scalable multicore architectures
VA Korthikanti, G Agha
2009 International Conference on Parallel Processing, 212-219, 2009
Cited by: 56; Year: 2009
Model checking MDPs with a unique compact invariant set of distributions
R Chadha, VA Korthikanti, M Viswanathan, G Agha, YM Kwon
2011 Eighth International Conference on Quantitative Evaluation of SysTems …, 2011
Cited by: 25; Year: 2011
An Empirical Study of Mamba-based Language Models
R Waleffe, W Byeon, D Riach, B Norick, V Korthikanti, T Dao, A Gu, ...
arXiv preprint arXiv:2406.07887, 2024
Cited by: 22; Year: 2024
Re-ViLM: Retrieval-augmented visual language model for zero and few-shot image captioning
Z Yang, W Ping, Z Liu, V Korthikanti, W Nie, DA Huang, L Fan, Z Yu, S Lan, ...
arXiv preprint arXiv:2302.04858, 2023
Cited by: 22; Year: 2023
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model. arXiv 2022
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990
Cited by: 21
Fair k mutual exclusion algorithm for peer to peer systems
VA Reddy, P Mittal, I Gupta
2008 The 28th International Conference on Distributed Computing Systems, 655-662, 2008
Cited by: 20; Year: 2008
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model. arXiv
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
Preprint published online January 28, 2022
Cited by: 14; Year: 2022
Energy-performance trade-off analysis of parallel algorithms
VA Korthikanti, G Agha
USENIX Workshop on Hot Topics in Parallelism (HotPar), 2010
Cited by: 13; Year: 2010
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
A large-scale generative language model, 2022
Cited by: 12; Year: 2022
On the energy complexity of parallel algorithms
VA Korthikanti, G Agha, M Greenstreet
2011 International Conference on Parallel Processing, 562-570, 2011
Cited by: 11; Year: 2011
Avoiding energy wastage in parallel applications
VA Korthikanti, G Agha
International Conference on Green Computing, 149-163, 2010
Cited by: 11; Year: 2010
An efficient algorithm to reduce test power consumption by scan cell and scan vector reordering
KVA Reddy, S Chattopadhyay
Proceedings of the IEEE INDICON 2004. First India Annual Conference, 2004 …, 2004
Cited by: 10; Year: 2004
Energy bounded scalability analysis of parallel algorithms
VA Korthikanti, GA Agha
Cited by: 9; Year: 2009