Guangxuan Xiao
Ph.D. student, MIT
Verified email at mit.edu
Title
Cited by
Year
SmoothQuant: Accurate and efficient post-training quantization for large language models
G Xiao, J Lin, M Seznec, H Wu, J Demouth, S Han
International Conference on Machine Learning, 38087-38099, 2023
Cited by: 420; Year: 2023
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration
J Lin, J Tang, H Tang, S Yang, WM Chen, WC Wang, G Xiao, X Dang, ...
Proceedings of Machine Learning and Systems 6, 87-100, 2024
Cited by: 267; Year: 2024
Efficient streaming language models with attention sinks
G Xiao, Y Tian, B Chen, S Han, M Lewis
International Conference on Learning Representations (ICLR), 2024
Cited by: 172; Year: 2024
FastComposer: Tuning-free multi-subject image generation with localized attention
G Xiao, T Yin, WT Freeman, F Durand, S Han
arXiv preprint arXiv:2305.10431, 2023
Cited by: 92; Year: 2023
Red alarm for pre-trained models: Universal vulnerability to neuron-level backdoor attacks
Z Zhang, G Xiao, Y Li, T Lv, F Qi, Z Liu, Y Wang, X Jiang, M Sun
Machine Intelligence Research 20 (2), 180-193, 2023
Cited by: 73; Year: 2023
Offsite-tuning: Transfer learning without full model
G Xiao, J Lin, S Han
arXiv preprint arXiv:2302.04870, 2023
Cited by: 46; Year: 2023
QServe: W4A8KV4 quantization and system co-design for efficient LLM serving
Y Lin, H Tang, S Yang, Z Zhang, G Xiao, C Gan, S Han
arXiv preprint arXiv:2405.04532, 2024
Cited by: 6; Year: 2024
BitDelta: Your fine-tune may only be worth one bit
J Liu, G Xiao, K Li, JD Lee, S Han, T Dao, T Cai
arXiv preprint arXiv:2402.10193, 2024
Cited by: 6; Year: 2024
InfLLM: Unveiling the intrinsic capacity of LLMs for understanding extremely long sequences with training-free memory
C Xiao, P Zhang, X Han, G Xiao, Y Lin, Z Zhang, Z Liu, S Han, M Sun
arXiv preprint arXiv:2402.04617, 2024
Cited by: 6; Year: 2024
FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training
K Huang, H Jiang, M Wang, G Xiao, D Wipf, X Song, Q Gan, Z Huang, ...
Proceedings of the VLDB Endowment 17 (6), 1473-1486, 2024
Cited by: 6*; Year: 2024
Retrieval head mechanistically explains long-context factuality
W Wu, Y Wang, G Xiao, H Peng, Y Fu
arXiv preprint arXiv:2404.15574, 2024
Cited by: 5; Year: 2024
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
J Tang, Y Zhao, K Zhu, G Xiao, B Kasikci, S Han
arXiv preprint arXiv:2406.10774, 2024
Cited by: 2; Year: 2024
Sparse and Local Networks for Hypergraph Reasoning
G Xiao, LP Kaelbling, J Wu, J Mao
Learning on Graphs Conference, 34:1-34:16, 2022
Year: 2022
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
C Xiao, P Zhang, X Han, G Xiao, Y Lin, Z Zhang, Z Liu, M Sun
First Workshop on Long-Context Foundation Models @ ICML 2024, 0