‪Gabriele Oliaro‬ - ‪Google Scholar‬

Cited by

	All	Since 2019
Citations	138	138
h-index	4	4
i10-index	3	3

Citations per year
Year	Citations
2022	2
2023	24
2024	112

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Zhihao Jia, Assistant Professor of Computer Science, Carnegie Mellon University (verified email at cmu.edu)
Minlan Yu, Harvard University (verified email at g.harvard.edu)

Gabriele Oliaro

Carnegie Mellon University

Verified email at cs.cmu.edu

Machine Learning, Distributed Systems, Parallel Computing, Networking


Title	Cited by	Year
SpecInfer: Accelerating large language model serving with tree-based speculative inference and verification. X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, Z Zhang, RYY Wong, A Zhu, ... Proceedings of the 29th ACM International Conference on Architectural …, 2024	83	2024
Towards efficient generative large language model serving: A survey from algorithms to systems. X Miao, G Oliaro, Z Zhang, X Cheng, H Jin, T Chen, Z Jia. arXiv preprint arXiv:2312.15234, 2023	34	2023
Zero-CPU Collection with Direct Telemetry Access. J Langlet, RB Basat, S Ramanathan, G Oliaro, M Mitzenmacher, M Yu, ... ACM Workshop on Hot Topics in Networks (HotNets '21), 108–115, 2021	13	2021
Direct Telemetry Access. J Langlet, R Ben Basat, G Oliaro, M Mitzenmacher, M Yu, G Antichi. ACM SIGCOMM 2023 Conference, 832–849, 2023	5	2023
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models. Z Zhang, D Zhao, X Miao, G Oliaro, Q Li, Y Jiang, Z Jia. arXiv preprint arXiv:2401.07159, 2024	2	2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning. X Miao, G Oliaro, X Cheng, M Wu, C Unger, Z Jia. arXiv preprint arXiv:2402.18789, 2024	1	2024
Optimal Kernel Orchestration for Tensor Programs with Korch. M Hu, A Venkatram, S Biswas, B Marimuthu, B Hou, G Oliaro, H Wang, ... Proceedings of the 29th ACM International Conference on Architectural …, 2024		2024
