Multi-modal dense video captioning | V Iashin, E Rahtu | Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020 | 173 | 2020 |
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer | V Iashin, E Rahtu | Proceedings of British Machine Vision Conference (BMVC), 2020 | 138 | 2020 |
Taming Visually Guided Sound Generation | V Iashin, E Rahtu | Proceedings of British Machine Vision Conference (BMVC), 2021 | 52 | 2021 |
Top-1 CORSMAL challenge 2020 submission: Filling mass estimation using multi-modal observations of human-robot handovers | V Iashin, F Palermo, G Solak, C Coppola | Pattern Recognition. ICPR International Workshops and Challenges: Virtual …, 2021 | 13 | 2021 |
Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors | V Iashin, W Xie, E Rahtu, A Zisserman | Proceedings of British Machine Vision Conference (BMVC), 2022 | 10 | 2022 |
The CORSMAL benchmark for the prediction of the properties of containers | A Xompero, S Donaher, V Iashin, F Palermo, G Solak, C Coppola, ... | IEEE Access 10, 41388-41402, 2022 | 7 | 2022 |
Synchformer: Efficient Synchronization from Sparse Cues | V Iashin, W Xie, E Rahtu, A Zisserman | arXiv preprint arXiv:2401.16423, 2024 | | 2024 |
Multi-modal Video Content Understanding | V Iashin | Tampere University, 2023 | | 2023 |