X-CLIP: End-to-end multi-grained contrastive learning for video-text retrieval Y Ma, G Xu, X Sun, M Yan, J Zhang, R Ji Proceedings of the 30th ACM International Conference on Multimedia (ACM MM …, 2022 | 136 | 2022 |
Towards local visual modeling for image captioning Y Ma, J Ji, X Sun, Y Zhou, R Ji Pattern Recognition (PR) 138, 109420, 2023 | 30 | 2023 |
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance Y Ma, X Zhang, X Sun, J Ji, H Wang, G Jiang, W Zhuang, R Ji Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 25 | 2023 |
Knowing what to learn: a metric-oriented focal mechanism for image captioning J Ji*, Y Ma*, X Sun, Y Zhou, Y Wu, R Ji IEEE Transactions on Image Processing (IEEE TIP) 31, 4321-4335, 2022 | 23 | 2022 |
Knowing what it is: semantic-enhanced dual attention transformer Y Ma, J Ji, X Sun, Y Zhou, Y Wu, F Huang, R Ji IEEE Transactions on Multimedia (IEEE TMM), 2022 | 16 | 2022 |
Beyond first impressions: Integrating joint multi-modal cues for comprehensive 3d representation H Wang, J Tang, J Ji, X Sun, R Zhang, Y Ma, M Zhao, L Li, Z Zhao, T Lv, ... Proceedings of the 31st ACM International Conference on Multimedia (ACM MM …, 2023 | 5 | 2023 |
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation Y Ma, Y Fan, J Ji, H Wang, X Sun, G Jiang, A Shu, R Ji arXiv preprint arXiv:2312.00085, 2023 | 2 | 2023 |
BEAT: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval Y Ma, X Sun, J Ji, G Jiang, W Zhuang, R Ji Proceedings of the 31st ACM International Conference on Multimedia (ACM MM …, 2023 | 2 | 2023 |
Semi-Supervised Panoptic Narrative Grounding D Yang, J Ji, X Sun, H Wang, Y Li, Y Ma, R Ji Proceedings of the 31st ACM International Conference on Multimedia (ACM MM …, 2023 | 2 | 2023 |
JM3D & JM3D-LLM: Elevating 3D Representation with Joint Multi-modal Cues J Ji, H Wang, C Wu, Y Ma, X Sun, R Ji arXiv preprint arXiv:2310.09503, 2023 | 1 | 2023 |
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation Y Ma, Z Lin, J Ji, Y Fan, X Sun, R Ji International Conference on Machine Learning (ICML), 2024 | | 2024 |
Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation T Guo, H Wang, Y Ma, J Ji, X Sun Proceedings of the AAAI Conference on Artificial Intelligence 38 (3), 1985-1993, 2024 | | 2024 |
X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks Z Qian, Y Ma, J Ji, X Sun Proceedings of the AAAI Conference on Artificial Intelligence 38 (5), 4551-4559, 2024 | | 2024 |
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation S Liu, Y Ma, X Zhang, H Wang, J Ji, X Sun, R Ji Conference on Computer Vision and Pattern Recognition (CVPR), 2024 | | 2024 |
3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation C Wu*, Y Ma*, Q Chen, H Wang, G Luo, J Ji, X Sun Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023 | | 2023 |