CoLA: Weakly-supervised temporal action localization with snippet contrastive learning C Zhang, M Cao, D Yang, J Chen, Y Zou Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021 | 134 | 2021 |
On Pursuit of Designing Multi-modal Transformer for Video Grounding M Cao, L Chen, MZ Shou, C Zhang, Y Zou Conference on Empirical Methods in Natural Language Processing (EMNLP 2021 Oral), 2021 | 61 | 2021 |
LocVTP: Video-Text Pre-training for Temporal Localization M Cao, T Yang, J Weng, C Zhang, J Wang, Y Zou European Conference on Computer Vision (ECCV), 2022 | 39 | 2022 |
Unsupervised Pre-training for Temporal Action Localization Tasks C Zhang, T Yang, J Weng, M Cao, J Wang, Y Zou Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 32 | 2022 |
Deep Motion Prior for Weakly-Supervised Temporal Action Localization M Cao, C Zhang, L Chen, MZ Shou, Y Zou IEEE Transactions on Image Processing, 2022 | 19 | 2022 |
Non-local nested residual attention network for stereo image super-resolution W Xie, J Zhang, Z Lu, M Cao, Y Zhao ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 16 | 2020 |
RR-Net: Relation reasoning for end-to-end human-object interaction detection D Yang, Y Zou, C Zhang, M Cao, J Chen IEEE Transactions on Circuits and Systems for Video Technology 32 (6), 3853-3865, 2021 | 15* | 2021 |
UniFaceGAN: a unified framework for temporally consistent facial video editing M Cao, H Huang, H Wang, X Wang, L Shen, S Wang, L Bao, Z Li, J Luo IEEE Transactions on Image Processing 30, 6107-6116, 2021 | 15* | 2021 |
Exploring recommendation capabilities of GPT-4V(ision): A preliminary case study P Zhou*, M Cao*, YL Huang*, Q Ye*, P Zhang, J Liu, Y Xie, Y Hua, J Kim arXiv preprint arXiv:2311.04199, 2023 | 14 | 2023 |
Weakly labelled audio tagging via convolutional networks with spatial and channel-wise attention S Hong, Y Zou, W Wang, M Cao ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 14 | 2020 |
All you need is a second look: Towards arbitrary-shaped text detection M Cao, C Zhang, D Yang, Y Zou IEEE Transactions on Circuits and Systems for Video Technology 32 (2), 758-767, 2021 | 13 | 2021 |
GISCA: Gradient-inductive segmentation network with contextual attention for scene text detection M Cao, Y Zou, D Yang, C Liu IEEE Access 7, 62805-62816, 2019 | 13 | 2019 |
Qilin-Med-VL: Towards Chinese large vision-language model for general healthcare J Liu, Z Wang, Q Ye, D Chong, P Zhou, Y Hua arXiv preprint arXiv:2310.17956, 2023 | 12 | 2023 |
Qilin-Med: Multi-stage knowledge injection advanced medical large language model Q Ye, J Liu, D Chong, P Zhou, Y Hua, F Liu, M Cao, Z Wang, X Cheng, ... arXiv preprint arXiv:2310.09089, 2023 | 11 | 2023 |
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory H Li, M Cao, X Cheng, Y Li, Z Zhu, Y Zou International Conference on Computer Vision (ICCV, Oral), 2023 | 11 | 2023 |
Generating templated caption for video grounding H Li, M Cao, X Cheng, Z Zhu, Y Li, Y Zou arXiv preprint arXiv 2301, 2, 2023 | 11 | 2023 |
Correspondence Matters for Video Referring Expression Comprehension M Cao, J Jiang, L Chen, Y Zou ACM International Conference on Multimedia (ACM MM), 2022 | 11 | 2022 |
All you need is a second look: Towards tighter arbitrary shape text detection M Cao, Y Zou ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 7 | 2020 |
Concept-aware video captioning: Describing videos with effective prior information B Yang, M Cao, Y Zou IEEE Transactions on Image Processing, 2023 | 6 | 2023 |
Iterative proposal refinement for weakly-supervised video grounding M Cao, F Wei, C Xu, X Geng, L Chen, C Zhang, Y Zou, T Shen, D Jiang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 6 | 2023 |