Dong Li
Shanghai AI Lab
Verified email at pjlab.org.cn
Neural Architecture Search on Efficient Transformers and Beyond
Z Liu*, D Li*, K Lu*, Z Qin, W Sun, J Xu, Y Zhong
arXiv preprint arXiv:2207.13955, 2022
Cited by 15 · 2022
Toeplitz neural network for sequence modeling
Z Qin, X Han, W Sun, B He, D Li, D Li, Y Dai
The Eleventh International Conference on Learning Representations (ICLR), 2023
Cited by 11 · 2023
Scaling TransNormer to 175 Billion Parameters
Z Qin, D Li, W Sun, W Sun, X Shen, X Han, Y Wei, B Lv, F Yuan, X Luo, ...
arXiv preprint arXiv:2307.14995, 2023
Cited by 7 · 2023
Fine-grained Audible Video Description
X Shen*, D Li*, J Zhou*, Z Qin, B He, X Han, A Li, Y Dai, L Kong, M Wang, ...
CVPR, 2023
Cited by 4 · 2023
Linear Video Transformer with Feature Fixation
K Lu, Z Liu, J Wang, W Sun, Z Qin, D Li, X Shen, H Deng, X Han, Y Dai, ...
arXiv preprint arXiv:2210.08164, 2022
Cited by 4 · 2022
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Z Qin, W Sun, D Li, X Shen, W Sun, Y Zhong
arXiv preprint arXiv:2401.04658, 2024
Cited by 1 · 2024
MAP: Low-data Regime Multimodal Learning with Adapter-based Pre-training and Prompting
W Li, D Li, W Li, Y Wang, H Jie, Y Zhong
Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD …, 2023
Cited by 1 · 2023
HGRN2: Gated Linear RNNs with State Expansion
Z Qin, S Yang, W Sun, X Shen, D Li, W Sun, Y Zhong
arXiv preprint arXiv:2404.07904, 2024
2024
Linear Attention Sequence Parallelism
W Sun, Z Qin, D Li, X Shen, Y Qiao, Y Zhong
arXiv preprint arXiv:2404.02882, 2024
2024
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
W Sun, Z Qin, W Sun, S Li, D Li, X Shen, Y Qiao, Y Zhong
arXiv preprint arXiv:2401.16265, 2024
2024
TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
Z Qin, D Li, W Sun, W Sun, X Shen, X Han, Y Wei, B Lv, X Luo, Y Qiao, ...
2023