AWQ: Activation-aware weight quantization for LLM compression and acceleration J Lin, J Tang, H Tang, S Yang, X Dang, S Han arXiv preprint arXiv:2306.00978, 2023 | 144 | 2023 |
FlatFormer: Flattened window attention for efficient point cloud transformer Z Liu, X Yang, H Tang, S Yang, S Han Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 34 | 2023 |
Heuristic adaptability to input dynamics for SpMM on GPUs G Dai, G Huang, S Yang, Z Yu, H Zhang, Y Ding, Y Xie, H Yang, Y Wang Proceedings of the 59th ACM/IEEE Design Automation Conference, 595-600, 2022 | 10 | 2022 |
TorchSparse++: Efficient training and inference framework for sparse convolution on GPUs H Tang, S Yang, Z Liu, K Hong, Z Yu, X Li, G Dai, Y Wang, S Han Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023 | 9* | 2023 |
HyperGef: A framework enabling efficient fusion for hypergraph neural network on GPUs Z Yu, G Dai, S Yang, G Zhang, H Zhang, F Zhu, J Yang, J Zhao, Y Wang Proceedings of Machine Learning and Systems 5, 2023 | 3 | 2023 |
Sparse Refinement for Efficient High-Resolution Semantic Segmentation Z Liu, Z Zhang, S Yang, H Tang, C Xu, K Keutzer, S Han | | 2023 |
CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory T Fu, C Wei, Z Zhu, S Yang, Z Yu, G Dai, H Yang, Y Wang 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-6, 2023 | | 2023 |