PopMAG: Pop music accompaniment generation Y Ren, J He, X Tan, T Qin, Z Zhao, TY Liu Proceedings of the 28th ACM International Conference on Multimedia, 1198-1206, 2020 | 106 | 2020 |
GeneFace: Generalized and high-fidelity audio-driven 3D talking face synthesis Z Ye, Z Jiang, Y Ren, J Liu, J He, Z Zhao arXiv preprint arXiv:2301.13430, 2023 | 61 | 2023 |
M4Singer: A multi-style, multi-singer and musical score provided Mandarin singing corpus L Zhang, R Li, S Wang, L Deng, J Liu, Y Ren, J He, R Huang, J Zhu, ... Advances in Neural Information Processing Systems 35, 6914-6926, 2022 | 40 | 2022 |
TranSpeech: Speech-to-speech translation with bilateral perturbation R Huang, J Liu, H Liu, Y Ren, L Zhang, J He, Z Zhao arXiv preprint arXiv:2205.12523, 2022 | 28 | 2022 |
GeneFace++: Generalized and stable real-time audio-driven 3D talking face generation Z Ye, J He, Z Jiang, R Huang, J Huang, J Liu, Y Ren, X Yin, Z Ma, Z Zhao arXiv preprint arXiv:2305.00787, 2023 | 12 | 2023 |
Flow-based unconstrained lip to speech generation J He, Z Zhao, Y Ren, J Liu, B Huai, N Yuan Proceedings of the AAAI Conference on Artificial Intelligence 36 (1), 843-851, 2022 | 11 | 2022 |
CLAPSpeech: Learning prosody from text context with contrastive language-audio pre-training Z Ye, R Huang, Y Ren, Z Jiang, J Liu, J He, X Yin, Z Zhao arXiv preprint arXiv:2305.10763, 2023 | 10 | 2023 |
Mega-TTS 2: Zero-shot text-to-speech with arbitrary length speech prompts Z Jiang, J Liu, Y Ren, J He, C Zhang, Z Ye, P Wei, C Wang, X Yin, Z Ma, ... arXiv preprint arXiv:2307.07218, 2023 | 9 | 2023 |
AV-TranSpeech: Audio-visual robust speech-to-speech translation R Huang, H Liu, X Cheng, Y Ren, L Li, Z Ye, J He, L Zhang, J Liu, X Yin, ... arXiv preprint arXiv:2305.15403, 2023 | 6 | 2023 |
RMSSinger: Realistic-music-score based singing voice synthesis J He, J Liu, Z Ye, R Huang, C Cui, H Liu, Z Zhao arXiv preprint arXiv:2305.10686, 2023 | 4 | 2023 |
Real3D-Portrait: One-shot realistic 3D talking portrait synthesis Z Ye, T Zhong, Y Ren, J Yang, W Li, J Huang, Z Jiang, J He, R Huang, ... arXiv preprint arXiv:2401.08503, 2024 | 3 | 2024 |
UniSinger: Unified End-to-End Singing Voice Synthesis With Cross-Modality Information Matching Z Hong, C Cui, R Huang, L Zhang, J Liu, J He, Z Zhao Proceedings of the 31st ACM International Conference on Multimedia, 7569-7579, 2023 | 3 | 2023 |
ViT-TTS: Visual text-to-speech with scalable diffusion transformer H Liu, R Huang, X Lin, W Xu, M Zheng, H Chen, J He, Z Zhao arXiv preprint arXiv:2305.12708, 2023 | 2 | 2023 |
Wav2SQL: Direct generalizable speech-to-SQL parsing H Liu, R Huang, J He, G Sun, R Shen, X Cheng, Z Zhao arXiv preprint arXiv:2305.12552, 2023 | 2 | 2023 |
DualSign: Semi-Supervised Sign Language Production with Balanced Multi-Modal Multi-Task Dual Transformation W Huang, Z Zhao, J He, M Zhang Proceedings of the 30th ACM International Conference on Multimedia, 5486-5495, 2022 | 2 | 2022 |
Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis Z Jiang, J Liu, Y Ren, J He, Z Ye, S Ji, Q Yang, C Zhang, P Wei, C Wang, ... The Twelfth International Conference on Learning Representations, 2023 | 1 | 2023 |
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis Y Zhang, R Huang, R Li, JZ He, Y Xia, F Chen, X Duan, B Huai, Z Zhao Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 19597 …, 2024 | | 2024 |