Jie Lei 雷杰

Cited by

	All	Since 2019
Citations	3764	3757
h-index	19	19
i10-index	22	22

Citations per year

Year	2019	2020	2021	2022	2023	2024
Citations	47	115	353	777	1291	1116

Public access

9 articles available
1 article not available

Based on funding mandates

Co-authors

Mohit Bansal - Parker Distinguished Professor, Computer Science, UNC Chapel Hill - Verified email at cs.unc.edu
Tamara L Berg - Associate Professor, Computer Science, UNC Chapel Hill - Verified email at cs.unc.edu
Licheng Yu 虞立成 - Research Scientist and Manager, Facebook AI - Verified email at fb.com
Linjie (Lindsey) Li - Principal Researcher, Microsoft - Verified email at microsoft.com
Zhe Gan - Research Scientist, Apple - Verified email at apple.com
Luowei Zhou - Research Scientist, Google Deepmind - Verified email at google.com
Hao Tan - Adobe Research - Verified email at adobe.com
Jaemin Cho - PhD Student at UNC Chapel Hill - Verified email at cs.unc.edu
Gedas Bertasius - Assistant Professor, University of North Carolina at Chapel Hill - Verified email at cs.unc.edu
Yelong Shen - Microsoft - Verified email at microsoft.com
Liwei Wang - Assistant Professor at The Chinese University of Hong Kong - Verified email at cse.cuhk.edu.hk
Dong Yu (俞栋) - Distinguished Scientist @ Tencent AI Lab, ACM/IEEE/ISCA Fellow - Verified email at global.tencent.com
Zineng Tang - UC Berkeley - Verified email at cs.unc.edu
Thomas Wolf - Co-founder at HuggingFace - Verified email at polytechnique.edu
Yang Wang - Computer Science, Concordia University - Verified email at concordia.ca

Jie Lei 雷杰

Research Scientist, Meta AI

Verified email at fb.com

Computer Vision, Natural Language Processing, Vision and Language


Title	Cited by	Year
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling - J Lei, L Li, L Zhou, Z Gan, TL Berg, M Bansal, J Liu - CVPR 2021, Best Student Paper Honorable Mention, 2021	708	2021
TVQA: Localized, compositional video question answering - J Lei, L Yu, M Bansal, TL Berg - EMNLP 2018, 2018	677	2018
Unifying vision-and-language tasks via text generation - J Cho, J Lei, H Tan, M Bansal - ICML 2021, 2021	538	2021
Tvr: A large-scale dataset for video-subtitle moment retrieval - J Lei, L Yu, TL Berg, M Bansal - ECCV 2020, 2020	274	2020
TVQA+: Spatio-temporal grounding for video question answering - J Lei, L Yu, TL Berg, M Bansal - ACL 2020, 2020	251	2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning - J Lei, L Wang, Y Shen, D Yu, TL Berg, M Bansal - ACL 2020, 2020	207	2020
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries - J Lei, TL Berg, M Bansal - NeurIPS 2021, 2021	202*	2021
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners - Z Wang, M Li, R Xu, L Zhou, J Lei, X Lin, S Wang, Z Yang, C Zhu, ... - NeurIPS 2022, 2022	116	2022
Revealing single frame bias for video-and-language learning - J Lei, TL Berg, M Bansal - ACL 2023, 2022	114	2022
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation - L Li, J Lei, Z Gan, L Yu, YC Chen, R Pillai, Y Cheng, L Zhou, XE Wang, ... - NeurIPS 2021 Datasets and Benchmarks Track, 2021	113	2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning - H Tan, J Lei, T Wolf, M Bansal - CVPR 2022 workshop on Transformers for Vision, 2021	83	2021
VindLU: A Recipe for Effective Video-and-Language Pretraining - F Cheng, X Wang, J Lei, D Crandall, M Bansal, G Bertasius - CVPR 2023, 2022	74	2022
Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models - L Li, J Lei, Z Gan, J Liu - ICCV 2021, 2021	72	2021
DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization - Z Tang, J Lei, M Bansal - NAACL 2021, 2021	67	2021
What is More Likely to Happen Next? Video-and-Language Future Event Prediction - J Lei, L Yu, TL Berg, M Bansal - EMNLP 2020, 2020	67	2020
Vision Transformers are Parameter-Efficient Audio-Visual Learners - YB Lin, YL Sung, J Lei, M Bansal, G Bertasius - CVPR 2023, 2022	64	2022
RESIN-11: Schema-guided event prediction for 11 newsworthy scenarios - X Du, Z Zhang, S Li, P Yu, H Wang, T Lai, X Lin, Z Wang, I Liu, B Zhou, ... - Proceedings of the 2022 Conference of the North American Chapter of the …, 2022	37	2022
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound - YB Lin, J Lei, M Bansal, G Bertasius - ECCV 2022 Oral, 2022	37	2022
Weakly supervised image classification with coarse and fine labels - J Lei, Z Guo, Y Wang - 2017 14th conference on computer and robot vision (crv), 240-247, 2017	25	2017
Loopitr: Combining dual and cross encoder architectures for image-text retrieval - J Lei, X Chen, N Zhang, M Wang, M Bansal, TL Berg, L Yu - arXiv preprint arXiv:2203.05465, 2022	14	2022
