Andrew Rouditchenko
PhD Student at MIT CSAIL
Verified email at mit.edu
Title
Cited by
Year
The sound of pixels
H Zhao, C Gan, A Rouditchenko, C Vondrick, J McDermott, A Torralba
Proceedings of the European Conference on Computer Vision (ECCV), 570-586, 2018
Cited by: 538, Year: 2018
AVLnet: Learning audio-visual language representations from instructional videos
A Rouditchenko, A Boggust, D Harwath, B Chen, D Joshi, S Thomas, ...
Proc. Interspeech 2021, 1584-1588, 2021
Cited by: 130, Year: 2021
Self-supervised audio-visual co-segmentation
A Rouditchenko, H Zhao, C Gan, J McDermott, A Torralba
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 118, Year: 2019
Everything at once - multi-modal fusion transformer for video retrieval
N Shvetsova, B Chen, A Rouditchenko, S Thomas, B Kingsbury, RS Feris, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 116, Year: 2022
Multimodal clustering networks for self-supervised learning from unlabeled videos
B Chen, A Rouditchenko, K Duarte, H Kuehne, S Thomas, A Boggust, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by: 69, Year: 2021
Contrastive audio-visual masked autoencoder
Y Gong, A Rouditchenko, AH Liu, D Harwath, L Karlinsky, H Kuehne, ...
arXiv preprint arXiv:2210.07839, 2022
Cited by: 64, Year: 2022
Cross-modal discrete representation learning
AH Liu, SY Jin, CIJ Lai, A Rouditchenko, A Oliva, J Glass
arXiv preprint arXiv:2106.05438, 2021
Cited by: 35, Year: 2021
CMKD: CNN/transformer-based cross-model knowledge distillation for audio classification
Y Gong, S Khurana, A Rouditchenko, J Glass
arXiv preprint arXiv:2203.06760, 2022
Cited by: 25, Year: 2022
UAVM: Towards unifying audio and visual models
Y Gong, AH Liu, A Rouditchenko, J Glass
IEEE Signal Processing Letters 29, 2437-2441, 2022
Cited by: 11, Year: 2022
Cascaded Multilingual Audio-Visual Learning from Videos
A Rouditchenko, A Boggust, D Harwath, S Thomas, H Kuehne, B Chen, ...
Proc. Interspeech 2021, 3006-3010, 2021
Cited by: 6, Year: 2021
Label-efficient audio classification through multitask learning and self-supervision
T Lee, T Gong, S Padhy, A Rouditchenko, A Ndirango
arXiv preprint arXiv:1910.12587, 2019
Cited by: 6, Year: 2019
Comparison of multilingual self-supervised and weakly-supervised speech pre-training for adaptation to unseen languages
A Rouditchenko, S Khurana, S Thomas, R Feris, L Karlinsky, H Kuehne, ...
arXiv preprint arXiv:2305.12606, 2023
Cited by: 4, Year: 2023
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset
I Palmer, A Rouditchenko, A Barbu, B Katz, J Glass
Proc. Interspeech 2021, 3650-3654, 2021
Cited by: 4, Year: 2021
C2KD: Cross-lingual cross-modal knowledge distillation for multilingual text-video retrieval
A Rouditchenko, YS Chuang, N Shvetsova, S Thomas, R Feris, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3, Year: 2023
Routing with self-attention for multimodal capsule networks
K Duarte, B Chen, N Shvetsova, A Rouditchenko, S Thomas, A Liu, ...
arXiv preprint arXiv:2112.00775, 2021
Cited by: 3, Year: 2021
What, when, and where?--Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
B Chen, N Shvetsova, A Rouditchenko, D Kondermann, S Thomas, ...
arXiv preprint arXiv:2303.16990, 2023
Cited by: 2, Year: 2023
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition
A Rouditchenko, R Collobert, T Likhomanenko
arXiv preprint arXiv:2309.17395, 2023
2023
Learning Audio-Video Language Representations
A Rouditchenko
Massachusetts Institute of Technology, 2021
2021
Everything at Once–Multi-modal Fusion Transformer for Video Retrieval Supplementary Material
N Shvetsova, B Chen, A Rouditchenko, S Thomas, B Kingsbury, R Feris, ...