Muhammad Maaz
PhD Computer Vision at MBZUAI
Verified email at mbzuai.ac.ae
Title
Cited by
Year
MaPLe: Multi-modal prompt learning
MU Khattak, H Rasheed, M Maaz, S Khan, FS Khan
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 265; Year: 2023
Video-ChatGPT: Towards detailed video understanding via large vision and language models
M Maaz, H Rasheed, S Khan, FS Khan
arXiv preprint arXiv:2306.05424, 2023
Cited by: 138; Year: 2023
EdgeNeXt: efficiently amalgamated CNN-transformer architecture for mobile vision applications
M Maaz, A Shaker, H Cholakkal, S Khan, SW Zamir, RM Anwer, ...
European Conference on Computer Vision, 3-20, 2022
Cited by: 132; Year: 2022
Bridging the gap between object and image-level representations for open-vocabulary detection
H Bangalath, M Maaz, MU Khattak, SH Khan, F Shahbaz Khan
Advances in Neural Information Processing Systems 35, 33781-33794, 2022
Cited by: 101; Year: 2022
Class-agnostic object detection with multi-modal transformer
M Maaz, H Rasheed, S Khan, FS Khan, RM Anwer, MH Yang
European Conference on Computer Vision, 512-531, 2022
Cited by: 73*; Year: 2022
Fine-tuned CLIP models are efficient video learners
H Rasheed, MU Khattak, M Maaz, S Khan, FS Khan
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 56; Year: 2023
UNETR++: delving into efficient and accurate 3D medical image segmentation
A Shaker, M Maaz, H Rasheed, S Khan, MH Yang, FS Khan
arXiv preprint arXiv:2212.04497, 2022
Cited by: 46; Year: 2022
GLaMM: Pixel grounding large multimodal model
H Rasheed, M Maaz, S Shaji, A Shaker, S Khan, H Cholakkal, RM Anwer, ...
arXiv preprint arXiv:2311.03356, 2023
Cited by: 22; Year: 2023
SwiftFormer: Efficient additive attention for transformer-based real-time mobile vision applications
A Shaker, M Maaz, H Rasheed, S Khan, MH Yang, FS Khan
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 17; Year: 2023
PG-Video-LLaVA: Pixel grounding large video-language models
S Munasinghe, R Thushara, M Maaz, HA Rasheed, S Khan, M Shah, ...
arXiv preprint arXiv:2311.13435, 2023
Cited by: 4; Year: 2023
Self-supervised learning for fine-grained visual categorization
M Maaz, HA Rasheed, D Gaddam
arXiv preprint arXiv:2105.08788, 2021
Cited by: 2; Year: 2021
PALO: A Polyglot Large Multimodal Model for 5B People
M Maaz, H Rasheed, A Shaker, S Khan, H Cholakkal, RM Anwer, ...
arXiv preprint arXiv:2402.14818, 2024
Cited by: 1; Year: 2024