Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding; S Wang, H Yang, X Wang, T Liu, P Wang, X Liang, K Ma, T Feng, X You, ...; arXiv preprint arXiv:2402.15678, 2024 | 2 | 2024 |
dgQuEST: Accelerating Large Scale Quantum Circuit Simulation through Hybrid CPU-GPU Memory Hierarchies; T Feng, S Chen, X You, S Zhong, H Yang, Z Luan, D Qian; Network and Parallel Computing: 18th IFIP WG 10.3 International Conference …, 2022 | 1 | 2022 |
AtRec: Accelerating Recommendation Model Training on CPUs; S Wang, T Feng, H Yang, X You, B Chen, T Liu, Z Luan, D Qian; IEEE Transactions on Parallel and Distributed Systems, 2024 | | 2024 |
Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU; J Liao, M Li, H Yang, Q Sun, B Sun, J Hao, T Feng, F Yu, S Chen, Y Tao, ...; 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2023 | | 2023 |