SpotServe: Serving generative large language models on preemptible instances. X Miao, C Shi, J Duan, X Xi, D Lin, B Cui, Z Jia. arXiv preprint arXiv:2311.15566, 2023 | 12 | 2023 |
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances. J Duan, Z Song, X Miao, X Xi, D Lin, GH Xu, M Zhang, Z Jia | 1 | |
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning. C Chen, X Li, Q Zhu, J Duan, P Sun, X Zhang, C Yang. Proceedings of the 29th ACM International Conference on Architectural …, 2024 | | 2024 |
MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving. J Duan, R Lu, H Duanmu, X Li, X Zhang, D Lin, I Stoica, H Zhang. arXiv preprint arXiv:2404.02015, 2024 | | 2024 |
Proteus: Simulating the Performance of Distributed DNN Training. J Duan, X Li, P Xu, X Zhang, S Yan, Y Liang, D Lin. arXiv preprint arXiv:2306.02267, 2023 | | 2023 |