ProDiff: Progressive fast diffusion model for high-quality text-to-speech R Huang, Z Zhao, H Liu, J Liu, C Cui, Y Ren Proceedings of the 30th ACM International Conference on Multimedia, 2595-2605, 2022 | 108 | 2022 |
Multi-Singer: Fast multi-singer singing voice vocoder with a large-scale corpus R Huang, F Chen, Y Ren, J Liu, C Cui, Z Zhao Proceedings of the 29th ACM International Conference on Multimedia, 3945-3954, 2021 | 69 | 2021 |
GenerSpeech: Towards style transfer for generalizable out-of-domain text-to-speech R Huang, Y Ren, J Liu, C Cui, Z Zhao Advances in Neural Information Processing Systems 35, 10970-10983, 2022 | 59 | 2022 |
SingGAN: Generative adversarial network for high-fidelity singing voice generation R Huang, C Cui, F Chen, Y Ren, J Liu, Z Zhao, B Huai, Z Wang Proceedings of the 30th ACM International Conference on Multimedia, 2525-2535, 2022 | 48 | 2022 |
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model C Cui, Y Ren, J Liu, F Chen, R Huang, M Lei, Z Zhao Interspeech 2021, 2021 | 24 | 2021 |
VarietySound: Timbre-controllable video to sound generation via unsupervised information disentanglement C Cui, Z Zhao, Y Ren, J Liu, R Huang, F Chen, Z Wang, B Huai, F Wu ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 9 | 2023 |
FastDiff 2: Revisiting and incorporating GANs and diffusion models in high-fidelity speech synthesis R Huang, Y Ren, Z Jiang, C Cui, J Liu, Z Zhao Findings of the Association for Computational Linguistics: ACL 2023, 6994-7009, 2023 | 5 | 2023 |
RMSSinger: Realistic-music-score based singing voice synthesis J He, J Liu, Z Ye, R Huang, C Cui, H Liu, Z Zhao arXiv preprint arXiv:2305.10686, 2023 | 4 | 2023 |
UniSinger: Unified End-to-End Singing Voice Synthesis With Cross-Modality Information Matching Z Hong, C Cui, R Huang, L Zhang, J Liu, J He, Z Zhao Proceedings of the 31st ACM International Conference on Multimedia, 7569-7579, 2023 | 3 | 2023 |