Yucheng Zhao
MEGVII Technology
Verified email at mail.ustc.edu.cn
Title
Cited by
Year
Omnivl: One foundation model for image-language and video-language tasks
J Wang, D Chen, Z Wu, C Luo, L Zhou, Y Zhao, Y Xie, C Liu, YG Jiang, ...
Advances in neural information processing systems 35, 5696-5710, 2022
96
2022
A battle of network structures: An empirical study of cnn, transformer, and mlp
Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha
arXiv preprint arXiv:2108.13002, 2021
77
2021
Sparse MLP for image recognition: Is self-attention really necessary?
C Tang, Y Zhao, G Wang, C Luo, W Xie, W Zeng
Proceedings of the AAAI conference on artificial intelligence 36 (2), 2344-2351, 2022
76
2022
When shift operation meets vision transformer: An extremely simple alternative to attention mechanism
G Wang, Y Zhao, C Tang, C Luo, W Zeng
Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 2423-2430, 2022
42
2022
Self-supervised visual representations learning by contrastive mask prediction
Y Zhao, G Wang, C Luo, W Zeng, ZJ Zha
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
39
2021
Peripheral vision transformer
J Min, Y Zhao, C Luo, M Cho
Advances in Neural Information Processing Systems 35, 32097-32111, 2022
22
2022
Look before you match: Instance understanding matters in video object segmentation
J Wang, D Chen, Z Wu, C Luo, C Tang, X Dai, Y Zhao, Y Xie, L Yuan, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023
16
2023
Multi-scale group transformer for long sequence modeling in speech separation
Y Zhao, C Luo, ZJ Zha, W Zeng
Proceedings of the Twenty-Ninth International Conference on International …, 2021
11
2021
RetrieverTTS: Modeling decomposed factors for text-based speech insertion
D Yin, C Tang, Y Liu, X Wang, Z Zhao, Y Zhao, Z Xiong, S Zhao, C Luo
arXiv preprint arXiv:2206.13865, 2022
10
2022
Zero-shot text-to-speech for text-based insertion in audio narration
C Tang, C Luo, Z Zhao, D Yin, Y Zhao, W Zeng
arXiv preprint arXiv:2109.05426, 2021
8
2021
Adriver-i: A general world model for autonomous driving
F Jia, W Mao, Y Liu, Y Zhao, Y Wen, C Zhang, X Zhang, T Wang
arXiv preprint arXiv:2311.13549, 2023
7
2023
General-purpose speech representation learning through a self-supervised multi-granularity framework
Y Zhao, D Yin, C Luo, Z Zhao, C Tang, W Zeng, ZJ Zha
arXiv preprint arXiv:2102.01930, 2021
7
2021
Streaming video model
Y Zhao, C Luo, C Tang, D Chen, N Codella, ZJ Zha
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
5
2023
Stream Query Denoising for Vectorized HD Map Construction
S Wang, F Jia, Y Liu, Y Zhao, Z Chen, T Wang, C Zhang, X Zhang, F Zhao
arXiv preprint arXiv:2401.09112, 2024
1
2024
Panacea: Panoramic and Controllable Video Generation for Autonomous Driving
Y Wen, Y Zhao, Y Liu, F Jia, Y Wang, C Luo, C Zhang, T Wang, X Sun, ...
arXiv preprint arXiv:2311.16813, 2023
1
2023
VLM-Eval: A General Evaluation on Video Large Language Models
S Li, Y Zhang, Y Zhao, Q Wang, F Jia, Y Liu, T Wang
arXiv preprint arXiv:2311.11865, 2023
1
2023
Attention-Guided Contrastive Masked Image Modeling for Transformer-Based Self-Supervised Learning
Y Zhan, Y Zhao, C Luo, Y Zhang, X Sun
2023 IEEE International Conference on Image Processing (ICIP), 2490-2494, 2023
2023
Filler Word Detection with Hard Category Mining and Inter-Category Focal Loss
Z Zhao, L Wu, C Tang, D Yin, Y Zhao, C Luo
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
2023
T2D: Spatiotemporal Feature Learning Based on Triple 2D Decomposition
Y Zhao, C Luo, C Tang, D Chen, NC Codella, L Yuan, ZJ Zha
2022