Antoine Yang

Cited by

	All	Since 2019
Citations	1152	1151
h-index	10	10
i10-index	10	10

Citations per year: 2019: 4, 2020: 42, 2021: 82, 2022: 115, 2023: 365, 2024: 541

Public access

7 articles available
0 articles not available
Based on funding mandates

Co-authors

Cordelia Schmid - Research director INRIA - Verified email at inria.fr
Josef Sivic - Czech Technical University, CIIRC, ELLIS Unit Prague - Verified email at cvut.cz
Ivan Laptev - Visiting professor at MBZUAI, on leave from INRIA - Verified email at inria.fr
Antoine Miech - Google DeepMind - Verified email at google.com
Fabio Maria Carlucci - Meta - Verified email at meta.com
Pedro M Esperança - Machine Learning Engineer (London, UK) - Verified email at huawei.com
Arsha Nagrani - Research Scientist, Google - Verified email at google.com

Antoine Yang

Google DeepMind

Verified email at google.com - Homepage

Computer Vision, Machine Learning, Deep Learning, Vision and Language


Title	Authors	Venue	Cited by	Year
Just ask: Learning to answer questions from millions of narrated videos	A Yang, A Miech, J Sivic, I Laptev, C Schmid	Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021	265	2021
NAS evaluation is frustratingly hard	A Yang, PM Esperança, FM Carlucci	International Conference on Learning Representations, 2020	199	2020
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...	arXiv preprint arXiv:2403.05530, 2024	196	2024
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models	A Yang, A Miech, J Sivic, I Laptev, C Schmid	Advances in Neural Information Processing Systems 35, 124-141, 2022	168	2022
Vid2seq: Large-scale pretraining of a visual language model for dense video captioning	A Yang, A Nagrani, PH Seo, A Miech, J Pont-Tuset, I Laptev, J Sivic, ...	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	142	2023
TubeDETR: Spatio-Temporal Video Grounding with Transformers	A Yang, A Miech, J Sivic, I Laptev, C Schmid	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	82	2022
MANAS: multi-agent neural architecture search	V Lopes, FM Carlucci, P Esperanca, M Singh, A Yang, V Gabillon, H Xu, ...	Machine Learning, 1-24, 2023	31*	2023
Learning to Answer Visual Questions from Web Videos	A Yang, A Miech, J Sivic, I Laptev, C Schmid	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022	28	2022
Covr: Learning composed video retrieval from web video captions	L Ventura, A Yang, C Schmid, G Varol	Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5270-5279, 2024	20	2024
VidChapters-7M: Video Chapters at Scale	A Yang, A Nagrani, I Laptev, J Sivic, C Schmid	Advances in Neural Information Processing Systems 36, 2023	13	2023
Just ask: Learning to answer questions from millions of narrated videos. 2021 IEEE	A Yang, A Miech, J Sivic, I Laptev, C Schmid	CVF International Conference on Computer Vision (ICCV), 1666-1677, 2020	8	2020
Learning Visual Language Models for Video Understanding	A Yang	Ecole Normale Superieure de Paris-ENS Paris, 2023		2023
VidChapters-7M: Video Chapters at Scale Supplementary Material	A Yang, A Nagrani, I Laptev, J Sivic, C Schmid
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models Supplementary Material	A Yang, A Miech, J Sivic, I Laptev, C Schmid
TubeDETR: Spatio-Temporal Video Grounding with Transformers Supplementary Material	A Yang, A Miech, J Sivic, I Laptev, C Schmid
Just Ask: Learning to Answer Questions from Millions of Narrated Videos Supplementary Material	A Yang, A Miech, J Sivic, I Laptev, C Schmid

Articles 1–16
