Make-A-Voice: Unified voice synthesis with discrete representation R Huang, C Zhang, Y Wang, D Yang, L Liu, Z Ye, Z Jiang, C Weng, ... arXiv preprint arXiv:2305.19269, 2023 | 21 | 2023 |
Connecting multi-modal contrastive representations Z Wang, Y Zhao, H Huang, J Liu, A Yin, L Tang, L Li, Y Wang, Z Zhang, ... Advances in Neural Information Processing Systems 36, 22099-22114, 2023 | 12 | 2023 |
FastLTS: Non-autoregressive end-to-end unconstrained lip-to-speech synthesis Y Wang, Z Zhao Proceedings of the 30th ACM International Conference on Multimedia, 5678-5687, 2022 | 8 | 2022 |
Speech-to-Speech Translation with Discrete-Unit-Based Style Transfer Y Wang, J Bai, R Huang, R Li, Z Hong, Z Zhao arXiv preprint arXiv:2309.07566, 2023 | 4 | 2023 |
Robust Singing Voice Transcription Serves Synthesis R Li, Y Zhang, Y Wang, Z Hong, R Huang, Z Zhao arXiv preprint arXiv:2405.09940, 2024 | 1 | 2024 |
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt Y Wang, R Hu, R Huang, Z Hong, R Li, W Liu, F You, T Jin, Z Zhao Proceedings of the 2024 Conference of the North American Chapter of the …, 2024 | | 2024 |
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion R Li, R Huang, Y Wang, Z Hong, Z Zhao arXiv preprint arXiv:2406.02429, 2024 | | 2024 |
Frieren: Efficient Video-to-Audio Generation with Rectified Flow Matching Y Wang, W Guo, R Huang, J Huang, Z Wang, F You, R Li, Z Zhao arXiv preprint arXiv:2406.00320, 2024 | | 2024 |
Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment Z Hong, R Huang, X Cheng, Y Wang, R Li, F You, Z Zhao, ... arXiv preprint arXiv:2404.09313, 2024 | | 2024 |
InstructSpeech: Following Speech Editing Instructions via Large Language Models R Huang, R Hu, Y Wang, Z Wang, X Cheng, Z Jiang, Z Ye, D Yang, L Liu, ... Forty-first International Conference on Machine Learning, 2024 | | 2024 |