Jiangfei Duan
Verified email at ie.cuhk.edu.hk
Title
Cited by
Year
SpotServe: Serving generative large language models on preemptible instances
X Miao, C Shi, J Duan, X Xi, D Lin, B Cui, Z Jia
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 22; Year: 2024
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
H Duanmu, Z Yuan, X Li, J Duan, X Zhang, D Lin
arXiv preprint arXiv:2405.06219, 2024
Cited by: 2; Year: 2024
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning
C Chen, X Li, Q Zhu, J Duan, P Sun, X Zhang, C Yang
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 2; Year: 2024
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances
J Duan, Z Song, X Miao, X Xi, D Lin, H Xu, M Zhang, Z Jia
21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024
Cited by: 2; Year: 2024
Proteus: Simulating the Performance of Distributed DNN Training
J Duan, X Li, P Xu, X Zhang, S Yan, Y Liang, D Lin
arXiv preprint arXiv:2306.02267, 2023
Cited by: 1; Year: 2023
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
J Duan, R Lu, H Duanmu, X Li, X Zhang, D Lin, I Stoica, H Zhang
Forty-first International Conference on Machine Learning, 2024
Cited by: 1*
Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
Q Zhu, J Duan, C Chen, S Liu, X Li, G Feng, X Lv, H Cao, X Chuanfu, ...
arXiv preprint arXiv:2406.15486, 2024
Year: 2024