Guangxuan Xiao

Cited by

	All	Since 2019
Citations	1101	1101
h-index	6	6
i10-index	6	6

Citations per year (chart): 2022: 24; 2023: 269; 2024: 802

Public access (based on funding mandates): 3 articles available, 0 articles not available

Co-authors

Song Han - Massachusetts Institute of Technology - Verified email at mit.edu
Ji Lin - OpenAI - Verified email at mit.edu
Haotian Tang - MIT, Nvidia | Previous: Waymo - Verified email at mit.edu
Shang Yang - Massachusetts Institute of Technology - Verified email at mit.edu
Wei-Chen Wang - Postdoctoral Research Associate, Massachusetts Institute of Technology - Verified email at mit.edu
Mike Lewis - Facebook AI Research - Verified email at fb.com
Beidi Chen - Carnegie Mellon University - Verified email at andrew.cmu.edu
Yuandong Tian - Research Scientist, Meta AI (FAIR) - Verified email at fb.com
Fredo Durand - Professor of Computer Science, MIT - Verified email at mit.edu
William T. Freeman - Professor of Computer Science, MIT - Verified email at mit.edu
Tianwei Yin - MIT - Verified email at mit.edu
Leslie Kaelbling - Verified email at csail.mit.edu
Jiajun Wu - Stanford University - Verified email at cs.stanford.edu
Jiayuan Mao - MIT CSAIL - Verified email at mit.edu

Guangxuan Xiao

Ph.D. student, MIT

Verified email at mit.edu

Natural Language Processing


Title	Cited by	Year
SmoothQuant: Accurate and efficient post-training quantization for large language models. G Xiao, J Lin, M Seznec, H Wu, J Demouth, S Han. International Conference on Machine Learning, 38087-38099, 2023	420	2023
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration. J Lin, J Tang, H Tang, S Yang, WM Chen, WC Wang, G Xiao, X Dang, ... Proceedings of Machine Learning and Systems 6, 87-100, 2024	267	2024
Efficient streaming language models with attention sinks. G Xiao, Y Tian, B Chen, S Han, M Lewis. International Conference on Learning Representations (ICLR), 2024	172	2024
FastComposer: Tuning-free multi-subject image generation with localized attention. G Xiao, T Yin, WT Freeman, F Durand, S Han. arXiv preprint arXiv:2305.10431, 2023	92	2023
Red alarm for pre-trained models: Universal vulnerability to neuron-level backdoor attacks. Z Zhang, G Xiao, Y Li, T Lv, F Qi, Z Liu, Y Wang, X Jiang, M Sun. Machine Intelligence Research 20 (2), 180-193, 2023	73	2023
Offsite-tuning: Transfer learning without full model. G Xiao, J Lin, S Han. arXiv preprint arXiv:2302.04870, 2023	46	2023
QServe: W4A8KV4 quantization and system co-design for efficient LLM serving. Y Lin, H Tang, S Yang, Z Zhang, G Xiao, C Gan, S Han. arXiv preprint arXiv:2405.04532, 2024	6	2024
BitDelta: Your fine-tune may only be worth one bit. J Liu, G Xiao, K Li, JD Lee, S Han, T Dao, T Cai. arXiv preprint arXiv:2402.10193, 2024	6	2024
InfLLM: Unveiling the intrinsic capacity of LLMs for understanding extremely long sequences with training-free memory. C Xiao, P Zhang, X Han, G Xiao, Y Lin, Z Zhang, Z Liu, S Han, M Sun. arXiv preprint arXiv:2402.04617, 2024	6	2024
FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training. K Huang, H Jiang, M Wang, G Xiao, D Wipf, X Song, Q Gan, Z Huang, ... Proceedings of the VLDB Endowment 17 (6), 1473-1486, 2024	6*	2024
Retrieval head mechanistically explains long-context factuality. W Wu, Y Wang, G Xiao, H Peng, Y Fu. arXiv preprint arXiv:2404.15574, 2024	5	2024
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference. J Tang, Y Zhao, K Zhu, G Xiao, B Kasikci, S Han. arXiv preprint arXiv:2406.10774, 2024	2	2024
Sparse and Local Networks for Hypergraph Reasoning. G Xiao, LP Kaelbling, J Wu, J Mao. Learning on Graphs Conference, 34:1-34:16, 2022		2022
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory. C Xiao, P Zhang, X Han, G Xiao, Y Lin, Z Zhang, Z Liu, M Sun. First Workshop on Long-Context Foundation Models @ ICML 2024, 0

Articles 1–14
