Sparse MLP for image recognition: Is self-attention really necessary? C Tang, Y Zhao, G Wang, C Luo, W Xie, W Zeng Proceedings of the AAAI conference on artificial intelligence 36 (2), 2344-2351, 2022 | 81 | 2022 |
A battle of network structures: An empirical study of CNN, Transformer, and MLP Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha arXiv preprint arXiv:2108.13002, 2021 | 80 | 2021 |
Joint time-frequency and time domain learning for speech enhancement C Tang, C Luo, Z Zhao, W Xie, W Zeng Proceedings of the twenty-ninth international conference on international …, 2021 | 67 | 2021 |
When shift operation meets vision transformer: An extremely simple alternative to attention mechanism G Wang, Y Zhao, C Tang, C Luo, W Zeng Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 2423-2430, 2022 | 43 | 2022 |
Look before you match: Instance understanding matters in video object segmentation J Wang, D Chen, Z Wu, C Luo, C Tang, X Dai, Y Zhao, Y Xie, L Yuan, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023 | 20 | 2023 |
RetrieverTTS: Modeling decomposed factors for text-based speech insertion D Yin, C Tang, Y Liu, X Wang, Z Zhao, Y Zhao, Z Xiong, S Zhao, C Luo arXiv preprint arXiv:2206.13865, 2022 | 10 | 2022 |
A battle of network structures: An empirical study of CNN, Transformer, and MLP Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha arXiv preprint arXiv:2108.13002, 2021 | 10 | 2021 |
Streaming video model Y Zhao, C Luo, C Tang, D Chen, N Codella, ZJ Zha Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 9 | 2023 |
Zero-shot text-to-speech for text-based insertion in audio narration C Tang, C Luo, Z Zhao, D Yin, Y Zhao, W Zeng arXiv preprint arXiv:2109.05426, 2021 | 8 | 2021 |
TridentSE: Guiding speech enhancement with 32 global tokens D Yin, Z Zhao, C Tang, Z Xiong, C Luo arXiv preprint arXiv:2210.12995, 2022 | 7 | 2022 |
General-purpose speech representation learning through a self-supervised multi-granularity framework Y Zhao, D Yin, C Luo, Z Zhao, C Tang, W Zeng, ZJ Zha arXiv preprint arXiv:2102.01930, 2021 | 7 | 2021 |
A new frame interpolation method with pixel-level motion vector field C Tang, R Wang, W Wang, W Gao 2014 IEEE Visual Communications and Image Processing Conference, 350-353, 2014 | 6 | 2014 |
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha arXiv 2108, 2021 | 5 | 2021 |
Method and system for video frame interpolation based on optical flow method T Chuanxin, R Wang, Z Wang, W Gao US Patent 10,531,093, 2020 | 5 | 2020 |
An anchor-free detector for continuous speech keyword spotting Z Zhao, C Tang, C Yao, C Luo arXiv preprint arXiv:2208.04622, 2022 | 2 | 2022 |
Frame interpolation with pixel-level motion vector field and mesh based hole filling C Tang, R Wang, Z Li, W Wang, W Gao CAAI Transactions on Intelligence Technology 1 (1), 72-78, 2016 | 2 | 2016 |
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation Y Wang, J Bao, W Weng, R Feng, D Yin, T Yang, J Zhang, QDZ Zhao, ... arXiv preprint arXiv:2311.18829, 2023 | | 2023 |
Speech enhancement T Chuanxin, Z Zhao, C Luo, W Zeng US Patent App. 17/927,861, 2023 | | 2023 |
Filler Word Detection with Hard Category Mining and Inter-Category Focal Loss Z Zhao, L Wu, C Tang, D Yin, Y Zhao, C Luo ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |
T2D: Spatiotemporal Feature Learning Based on Triple 2D Decomposition Y Zhao, C Luo, C Tang, D Chen, NC Codella, L Yuan, ZJ Zha | | 2022 |