Lorenzo Torresani
Meta, Fundamental AI Research (FAIR)
Verified email at meta.com
Title
Cited by
Year
Video ReCap: Recursive Captioning of Hour-Long Videos
MM Islam, N Ho, X Yang, T Nagarajan, L Torresani, G Bertasius
arXiv preprint arXiv:2402.13250, 2024
2024
Ht-step: Aligning instructional articles with how-to videos
T Afouras, E Mavroudi, T Nagarajan, H Wang, L Torresani
Advances in Neural Information Processing Systems 36, 2024
Cited by 2 · 2024
Ego4d goal-step: Toward hierarchical understanding of procedural activities
Y Song, E Byrne, T Nagarajan, H Wang, M Martin, L Torresani
Advances in Neural Information Processing Systems 36, 2024
Cited by 3 · 2024
Video ReCap: Recursive Captioning of Hour-Long Videos
M Mohaiminul Islam, N Ho, X Yang, T Nagarajan, L Torresani, G Bertasius
arXiv e-prints, arXiv: 2402.13250, 2024
2024
Ego-exo4d: Understanding skilled human activity from first-and third-person perspectives
K Grauman, A Westbury, L Torresani, K Kitani, J Malik, T Afouras, ...
arXiv preprint arXiv:2311.18259, 2023
Cited by 15 · 2023
Open-world instance segmentation: Top-down learning with bottom-up supervision
T Kalluri, W Wang, H Wang, M Chandraker, L Torresani, D Tran
Cited by 3 · 2023
Multiscale video pretraining for long-term activity forecasting
R Tan, M De Lange, M Iuzzolino, BA Plummer, K Saenko, K Ridgeway, ...
arXiv preprint arXiv:2307.12854, 2023
Cited by 3 · 2023
Anticipating future video based on present video
H Wang, A Miech, L Torresani
US Patent 11,636,681, 2023
2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
R Goyal, E Mavroudi, X Yang, S Sukhbaatar, L Sigal, M Feiszli, ...
arXiv preprint arXiv:2302.08063, 2023
Cited by 1 · 2023
Egocentric video task translation @ ego4d challenge 2022
Z Xue, Y Song, K Grauman, L Torresani
arXiv preprint arXiv:2302.01891, 2023
Cited by 3 · 2023
What you say is what you show: Visual narration detection in instructional videos
K Ashutosh, R Girdhar, L Torresani, K Grauman
arXiv preprint arXiv:2301.02307, 2023
Cited by 4 · 2023
Ego-only: Egocentric action detection without exocentric transferring
H Wang, MK Singh, L Torresani
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by 5 · 2023
Histoperm: A permutation-based view generation approach for improving histopathologic feature representation learning
J DiPalma, L Torresani, S Hassanpour
Journal of Pathology Informatics 14, 100320, 2023
Cited by 2 · 2023
Learning to ground instructional articles in videos through narrations
E Mavroudi, T Afouras, L Torresani
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by 6 · 2023
Relational space-time query in long-form videos
X Yang, FJ Chu, M Feiszli, R Goyal, L Torresani, D Tran
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 5 · 2023
Hiervl: Learning hierarchical video-language embeddings
K Ashutosh, R Girdhar, L Torresani, K Grauman
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 17 · 2023
Ego-only: Egocentric action detection without exocentric pretraining
H Wang, MK Singh, L Torresani
arXiv preprint arXiv:2301.01380, 2023
Cited by 6 · 2023
Egocentric video task translation
Z Xue, Y Song, K Grauman, L Torresani
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 4 · 2023
Classifying a video stream using a self-attention-based machine-learning model
G Bertasius, H Wang, L Torresani
US Patent App. 17/461,755, 2022
2022
Task-Specific Text Generation Based On Multimodal Inputs
X Lin, G Bertasius, J Wang, DN Parikh, L Torresani
US Patent App. 17/339,759, 2022
2022
Label hallucination for few-shot classification
Y Jian, L Torresani
Proceedings of the AAAI Conference on Artificial Intelligence 36 (6), 7005-7014, 2022
Cited by 33 · 2022
Calibrating Histopathology Image Classifiers Using Label Smoothing
J Wei, L Torresani, J Wei, S Hassanpour
International Conference on Artificial Intelligence in Medicine, 273-282, 2022
Cited by 3 · 2022
HistoPerm: A Permutation-Based View Generation Approach for Learning Histopathologic Feature Representations.
J DiPalma, L Torresani, S Hassanpour
CoRR, 2022
2022
Deformable video transformer
J Wang, L Torresani
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by 34 · 2022
Learning to recognize procedural activities with distant supervision
X Lin, F Petroni, G Bertasius, M Rohrbach, SF Chang, L Torresani
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 49 · 2022
Ego4d: Around the world in 3,000 hours of egocentric video
K Grauman, A Westbury, E Byrne, Z Chavis, A Furnari, R Girdhar, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 543 · 2022
Long-short temporal contrastive learning of video transformers
J Wang, G Bertasius, D Tran, L Torresani
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 45 · 2022
Generalized few-shot video classification with video retrieval and feature generation
Y Xian, B Korbar, M Douze, L Torresani, B Schiele, Z Akata
IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (12), 8949 …, 2021
Cited by 12 · 2021
Resolution-based distillation for efficient histology image classification
J DiPalma, AA Suriawinata, LJ Tafe, L Torresani, S Hassanpour
Artificial Intelligence in Medicine 119, 102136, 2021
Cited by 26 · 2021
Is space-time attention all you need for video understanding?
G Bertasius, H Wang, L Torresani
ICML, 2021
Cited by 1632 · 2021
Slot machines: Discovering winning combinations of random weights in neural networks
MM Aladago, L Torresani
International Conference on Machine Learning, 163-174, 2021
Cited by 10 · 2021
A multi-view approach to audio-visual speaker verification
L Sarı, K Singh, J Zhou, L Torresani, N Singhal, Y Saraf
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by 42 · 2021
Convolutional neural network based on groupwise convolution for efficient video analysis
K He, H Wang, MD Feiszli, L Torresani
US Patent 10,984,245, 2021
Cited by 4 · 2021
Beyond short clips: End-to-end video-level learning with collaborative memories
X Yang, H Fan, L Torresani, LS Davis, H Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by 18 · 2021
A petri dish for histopathology image analysis
J Wei, A Suriawinata, B Ren, X Liu, M Lisovsky, L Vaickus, C Brown, ...
Artificial Intelligence in Medicine: 19th International Conference on …, 2021
Cited by 42 · 2021
Vx2text: End-to-end learning of video-based text generation from multimodal inputs
X Lin, G Bertasius, J Wang, SF Chang, D Parikh, L Torresani
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by 60 · 2021
Supervoxel attention graphs for long-range video modeling
Y Wang, G Bertasius, TH Oh, A Gupta, M Hoai, L Torresani
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2021
Cited by 5 · 2021
Learn like a pathologist: curriculum learning by annotator agreement for histopathology image classification
J Wei, A Suriawinata, B Ren, X Liu, M Lisovsky, L Vaickus, C Brown, ...
Proceedings of the IEEE/CVF winter conference on applications of computer …, 2021
Cited by 53 · 2021
Only time can tell: Discovering temporal data for temporal modeling
L Sevilla-Lara, S Zha, Z Yan, V Goswami, M Feiszli, L Torresani
Proceedings of the IEEE/CVF winter conference on applications of computer …, 2021
Cited by 77 · 2021
Video understanding as machine translation
B Korbar, F Petroni, R Girdhar, L Torresani
arXiv preprint arXiv:2006.07203, 2020
Cited by 28 · 2020
Stein variational inference for discrete distributions
J Han, F Ding, X Liu, L Torresani, J Peng, Q Liu
International Conference on Artificial Intelligence and Statistics, 4563-4572, 2020
Cited by 24 · 2020
Task meta-transfer from limited parallel labels
Y Jian, K Ahmed, L Torresani
Meta-Learning workshop, NeurIPS, 2020
Cited by 2 · 2020
Cobe: Contextualized object embeddings from narrated instructional video
G Bertasius, L Torresani
Advances in Neural Information Processing Systems 33, 15133-15145, 2020
Cited by 21 · 2020
Generalized many-way few-shot video classification
Y Xian, B Korbar, M Douze, B Schiele, Z Akata, L Torresani
Computer Vision–ECCV 2020 Workshops: Glasgow, UK, August 23–28, 2020 …, 2020
Cited by 19 · 2020
Listen to look: Action recognition by previewing audio
R Gao, TH Oh, K Grauman, L Torresani
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by 249 · 2020
Classifying, segmenting, and tracking object instances in video with mask propagation
G Bertasius, L Torresani
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020
Cited by 185 · 2020
Self-supervised learning by cross-modal audio-video clustering
H Alwassel, D Mahajan, B Korbar, L Torresani, B Ghanem, D Tran
Advances in Neural Information Processing Systems 33, 9758-9770, 2020
Cited by 447 · 2020
Video modeling with correlation networks
H Wang, D Tran, L Torresani, M Feiszli
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by 151 · 2020
Semantic Segmentation of The Growth Stages of Plasmodium Parasites Using Convolutional Neural Networks
MM Aladago, L Torresani, EV Rosca
2019 IEEE AFRICON, 1-7, 2019
Cited by 1 · 2019
Unidual: A unified model for image and video understanding
Y Wang, D Tran, L Torresani
arXiv preprint arXiv:1906.03857, 2019
Cited by 2 · 2019
Attentive action and context factorization
Y Wang, V Tran, G Bertasius, L Torresani, M Hoai
arXiv preprint arXiv:1904.05410, 2019
Cited by 6 · 2019
Star-caps: Capsule networks with straight-through attentive routing
K Ahmed, L Torresani
Advances in neural information processing systems 32, 2019
Cited by 70 · 2019
Hacs: Human action clips and segments dataset for recognition and temporal localization
H Zhao, A Torralba, L Torresani, Z Yan
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by 265 · 2019
Leveraging the present to anticipate the future in videos
A Miech, I Laptev, J Sivic, H Wang, L Torresani, D Tran
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019
Cited by 74 · 2019
Learning temporal pose estimation from sparsely-labeled videos
G Bertasius, C Feichtenhofer, D Tran, J Shi, L Torresani
Advances in neural information processing systems 32, 2019
Cited by 81 · 2019
Video classification with channel-separated convolutional networks
D Tran, H Wang, L Torresani, M Feiszli
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by 632 · 2019
Scsampler: Sampling salient clips from video for efficient action recognition
B Korbar, D Tran, L Torresani
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by 238 · 2019
Distinit: Learning video representations without a single labeled video
R Girdhar, D Tran, L Torresani, D Ramanan
Proceedings of the IEEE/CVF International Conference on Computer Vision, 852-861, 2019
Cited by 71 · 2019
Learning discriminative motion features through detection
G Bertasius, C Feichtenhofer, D Tran, J Shi, L Torresani
arXiv preprint arXiv:1812.04172, 2018
Cited by 16 · 2018
Branchconnect: Image categorization with learned branch connections
K Ahmed, L Torresani
2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 1244-1253, 2018
Cited by 10 · 2018
Cooperative learning of audio and video models from self-supervised synchronization
B Korbar, D Tran, L Torresani
Advances in Neural Information Processing Systems 31, 2018
Cited by 500 · 2018
Self-supervised feature learning for semantic segmentation of overhead imagery
S Singh, A Batra, G Pang, L Torresani, S Basu, M Paluri, CV Jawahar
BMVA Press, 2018
Cited by 75 · 2018
Scenes-objects-actions: A multi-task, multi-label video dataset
J Ray, H Wang, D Tran, Y Wang, M Feiszli, L Torresani, M Paluri
Proceedings of the European Conference on Computer Vision (ECCV), 635-651, 2018
Cited by 34 · 2018
Maskconnect: Connectivity learning by gradient descent
K Ahmed, L Torresani
Proceedings of the European Conference on Computer Vision (ECCV), 349-365, 2018
Cited by 69 · 2018
What makes a video a video: Analyzing temporal information in video understanding models and datasets
DA Huang, V Ramanathan, D Mahajan, L Torresani, M Paluri, L Fei-Fei, ...
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2018
Cited by 151 · 2018
Object detection in video with spatiotemporal sampling networks
G Bertasius, L Torresani, J Shi
Proceedings of the European Conference on Computer Vision (ECCV), 331-346, 2018
Cited by 268 · 2018
Detect-and-track: Efficient pose estimation in videos
R Girdhar, G Gkioxari, L Torresani, M Paluri, D Tran
Proceedings of the IEEE conference on computer vision and pattern …, 2018
Cited by 286 · 2018
A closer look at spatiotemporal convolutions for action recognition
D Tran, H Wang, L Torresani, J Ray, Y LeCun, M Paluri
Proceedings of the IEEE conference on Computer Vision and Pattern …, 2018
Cited by 3221 · 2018
SLAC: A sparsely labeled dataset for action classification and localization
H Zhao, Z Yan, H Wang, L Torresani, A Torralba
arXiv preprint arXiv:1712.09374, 2017
Cited by 34 · 2017
Multiple hypothesis colorization and its application to image compression
MH Baig, L Torresani
Computer Vision and Image Understanding 164, 111-123, 2017
Cited by 31 · 2017
Connectivity learning in multi-branch networks
K Ahmed, L Torresani
arXiv preprint arXiv:1709.09582, 2017
Cited by 42 · 2017
Learning to Inpaint for Image Compression
M Haris Baig, V Koltun, L Torresani
arXiv e-prints, arXiv: 1709.08855, 2017
2017
Techniques for enabling or establishing the use of face recognition algorithms
SB Gokturk, D Anguelov, L Torresani, VO Vanhoucke, M Shah, DT Vu, ...
US Patent 9,690,979, 2017
Cited by 17 · 2017
Local Perturb-and-MAP for structured prediction
G Bertasius, Q Liu, L Torresani, J Shi
Artificial Intelligence and Statistics, 585-594, 2017
Cited by 3 · 2017
Deciphering severely degraded license plates
S Agarwal, D Tran, L Torresani, H Farid
Electronic Imaging 29, 138-143, 2017
Cited by 17 · 2017
Simple, efficient and effective keypoint tracking
R Girdhar, G Gkioxari, L Torresani, D Ramanan, M Paluri, D Tran
ICCV PoseTrack Workshop, 2017
Cited by 3 · 2017
Learning to inpaint for image compression
MH Baig, V Koltun, L Torresani
Advances in Neural Information Processing Systems 30, 2017
Cited by 65 · 2017
Looking under the hood: Deep neural network visualization to interpret whole-slide image analysis outcomes for colorectal polyps
B Korbar, AM Olofson, AP Miraflor, CM Nicka, MA Suriawinata, ...
Proceedings of the IEEE conference on computer vision and pattern …, 2017
Cited by 53 · 2017
Deep learning for classification of colorectal polyps on whole-slide images
B Korbar, AM Olofson, AP Miraflor, CM Nicka, MA Suriawinata, ...
Journal of pathology informatics 8 (1), 30, 2017
Cited by 307 · 2017
Convolutional random walk networks for semantic image segmentation
G Bertasius, L Torresani, SX Yu, J Shi
Proceedings of the IEEE conference on computer vision and pattern …, 2017
Cited by 160 · 2017
EXMOVES: mid-level features for efficient action recognition and video analysis
D Tran, L Torresani
International Journal of Computer Vision 119, 239-253, 2016
Cited by 16 · 2016
VideoMCC: a New benchmark for video comprehension
D Tran, M Bolonkin, M Paluri, L Torresani
arXiv preprint arXiv:1606.07373, 2016
Cited by 4 · 2016
Multiple Hypothesis Colorization
MH Baig, L Torresani
arXiv preprint arXiv:1606.06314, 2016
2016
Multiple Hypothesis Colorization
M Haris Baig, L Torresani
arXiv e-prints, arXiv: 1606.06314, 2016
2016
Recurrent mixture density network for spatiotemporal visual attention
L Bazzani, H Larochelle, L Torresani
arXiv preprint arXiv:1603.08199, 2016
Cited by 154 · 2016
Coupled depth learning
MH Baig, L Torresani
2016 IEEE Winter Conference on Applications of Computer Vision (WACV), 1-10, 2016
Cited by 40 · 2016
Self-taught object localization with deep networks
L Bazzani, A Bergamo, D Anguelov, L Torresani
2016 IEEE winter conference on applications of computer vision (WACV), 1-9, 2016
Cited by 199 · 2016
Colorization for image compression
MH Baig, L Torresani
CoRR abs/1606.06314, 2016
Cited by 4 · 2016
Cross-stitch networks for multi-task learning
I Misra, A Shrivastava, A Gupta, M Hebert
Proceedings of the IEEE conference on computer vision and pattern …, 2016
2016
Network of experts for large-scale image categorization
K Ahmed, MH Baig, L Torresani
Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The …, 2016
Cited by 137 · 2016
Deep end2end voxel2voxel prediction
D Tran, L Bourdev, R Fergus, L Torresani, M Paluri
Proceedings of the IEEE conference on computer vision and pattern …, 2016
Cited by 150 · 2016
Semantic segmentation with boundary neural fields
G Bertasius, J Shi, L Torresani
Proceedings of the IEEE conference on computer vision and pattern …, 2016
Cited by 227 · 2016
System and method for enabling image searching using manual enrichment, classification, and/or segmentation
SB Gokturk, B Sumengen, D Vu, N Dalal, D Yang, X Lin, A Khan, M Shah, ...
US Patent 9,082,162, 2015
Cited by 49 · 2015
System and method for search portions of objects in images and features thereof
SB Gokturk, B Sumengen, D Vu, N Dalal, D Yang, X Lin, A Khan, M Shah, ...
US Patent 9,008,435, 2015
Cited by 71 · 2015
Coarse-to-fine depth estimation from a single image via coupled regression and dictionary learning
MH Baig, L Torresani
arXiv 1501, 5, 2015
Cited by 8 · 2015
Coupled Depth Learning
M Haris Baig, L Torresani
arXiv e-prints, arXiv: 1501.04537, 2015
2015
Learning spatiotemporal features with 3d convolutional networks
D Tran, L Bourdev, R Fergus, L Torresani, M Paluri
Proceedings of the IEEE international conference on computer vision, 4489-4497, 2015
Cited by 9736 · 2015
High-for-low and low-for-high: Efficient boundary detection from deep object features and its applications to high-level vision
G Bertasius, J Shi, L Torresani
Proceedings of the IEEE international conference on computer vision, 504-512, 2015
Cited by 205 · 2015