Runsen Xu
Title
Cited by
Year
PointLLM: Empowering Large Language Models to Understand Point Clouds
R Xu, X Wang, T Wang, Y Chen, J Pang, D Lin
European Conference on Computer Vision (ECCV) Best Paper Candidate, 2024, 2023
Cited by: 286; Year: 2023
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
T Wang, X Mao, C Zhu, R Xu, R Lyu, P Li, X Chen, W Zhang, K Chen, ...
Computer Vision and Pattern Recognition (CVPR), 2024, 2023
Cited by: 143; Year: 2023
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
H Huang, Y Chen, Z Wang, R Huang, R Xu, T Wang, L Liu, X Cheng, ...
Neural Information Processing Systems (NeurIPS), 2024, 2024
Cited by: 104; Year: 2024
RNIN-VIO: Robust Neural Inertial Navigation Aided Visual-Inertial Odometry in Challenging Scenes
D Chen, N Wang, R Xu, W Xie, H Bao, G Zhang
International Symposium on Mixed and Augmented Reality (ISMAR) Oral, 2021, 2021
Cited by: 81; Year: 2021
Grounded 3D-LLM with Referent Tokens
Y Chen, S Yang, H Huang, T Wang, R Xu, R Lyu, D Lin, J Pang
arXiv preprint arXiv:2405.10370, 2024
Cited by: 79; Year: 2024
Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator
X Wang, R Xu, Z Cui, Z Wan, Y Zhang
Neural Information Processing Systems (NeurIPS), 2023, 2023
Cited by: 60; Year: 2023
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
R Xu, T Wang, W Zhang, R Chen, J Cao, J Pang, D Lin
Computer Vision and Pattern Recognition (CVPR), 2023, 2023
Cited by: 42; Year: 2023
CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving
R Chen, Y Mu, R Xu, W Shao, C Jiang, H Xu, Z Li, P Luo
International Conference on Learning Representations (ICLR), 2023, 2022
Cited by: 37; Year: 2022
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
R Xu, Z Huang, T Wang, Y Chen, J Pang, D Lin
Conference on Robot Learning (CoRL), 2024, 2024
Cited by: 34; Year: 2024
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
S Yang*, R Xu*, Y Xie, S Yang, M Li, J Lin, C Zhu, X Chen, H Duan, X Yue, ...
arXiv preprint arXiv:2505.23764, 2025
Cited by: 32; Year: 2025
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
R Lyu, T Wang, J Lin, S Yang, X Mao, Y Chen, R Xu, H Huang, C Zhu, ...
Neural Information Processing Systems (NeurIPS), 2024, 2024
Cited by: 28; Year: 2024
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
R Xu, W Wang, H Tang, X Chen, X Wang, FJ Chu, D Lin, M Feiszli, ...
arXiv preprint arXiv:2505.17015, 2025
Cited by: 23; Year: 2025
LIFE: Lighting Invariant Flow Estimation
Z Huang, X Pan, R Xu, Y Xu, G Zhang, H Li
arXiv preprint arXiv:2104.03097, 2021
Cited by: 13; Year: 2021
Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control
X Fu, X Wang, X Liu, J Bai, R Xu, P Wan, D Zhang, D Lin
arXiv preprint arXiv:2506.01943, 2025
Cited by: 10; Year: 2025
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
JL Lin, C Zhu, R Xu, X Mao, X Liu, T Wang, J Pang
Neural Information Processing Systems (NeurIPS), 2025, 2025
Cited by: 4; Year: 2025
PointLLM-V2: Empowering Large Language Models to Better Understand Point Clouds
R Xu, S Yang, X Wang, T Wang, Y Chen, J Pang, D Lin
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025, 2025
Cited by: 3; Year: 2025
VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization
S Yang, R Xu, C Cui, T Wang, D Lin, J Pang
International Conference on Computer Vision (ICCV), 2025, 2025
Cited by: 1; Year: 2025
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
J Lin*, R Xu*, S Zhu, S Yang, P Cao, Y Ran, M Hu, C Zhu, Y Xie, Y Long, ...
arXiv preprint arXiv:2512.10863, 2025
Year: 2025
GVLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
W Hu, J Lin, Y Long, Y Ran, L Jiang, Y Wang, C Zhu, R Xu, T Wang, ...
arXiv preprint arXiv:2511.21688, 2025
Year: 2025
ChangingGrounding: 3D Visual Grounding in Changing Scenes
M Hu, Z Huang, T Wang, J Pang, D Lin, N Zheng, R Xu
arXiv preprint arXiv:2510.14965, 2025
Year: 2025