ESPnet: End-to-end speech processing toolkit S Watanabe, T Hori, S Karita, T Hayashi, J Nishitoba, Y Unno, NEY Soplin, ... arXiv preprint arXiv:1804.00015, 2018 | 1467 | 2018 |
A comparative study on Transformer vs RNN in speech applications S Karita, N Chen, T Hayashi, T Hori, H Inaguma, Z Jiang, M Someki, ... 2019 IEEE automatic speech recognition and understanding workshop (ASRU …, 2019 | 767 | 2019 |
WaveGrad: Estimating gradients for waveform generation N Chen, Y Zhang, H Zen, RJ Weiss, M Norouzi, W Chan International Conference on Learning Representations, 2021 | 596 | 2021 |
Gemini: A family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 568 | 2023 |
Deep feature for text-dependent speaker verification Y Liu, Y Qian, N Chen, T Fu, Y Zhang, K Yu Speech Communication 73, 1-13, 2015 | 209 | 2015 |
Zero-shot multi-speaker text-to-speech with state-of-the-art neural speaker embeddings E Cooper, CI Lai, Y Yasuda, F Fang, X Wang, N Chen, J Yamagishi ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 181 | 2020 |
ASSERT: Anti-spoofing with squeeze-excitation and residual networks CI Lai, N Chen, J Villalba, N Dehak arXiv preprint arXiv:1904.01120, 2019 | 169 | 2019 |
Google USM: Scaling automatic speech recognition beyond 100 languages Y Zhang, W Han, J Qin, Y Wang, A Bapna, Z Chen, N Chen, B Li, ... arXiv preprint arXiv:2303.01037, 2023 | 132 | 2023 |
x-vectors meet emotions: A study on dependencies between emotion and speaker recognition R Pappagari, T Wang, J Villalba, N Chen, N Dehak ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 132 | 2020 |
State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and speakers in the wild evaluations J Villalba, N Chen, D Snyder, D Garcia-Romero, A McCree, G Sell, ... Computer Speech & Language 60, 101026, 2020 | 132 | 2020 |
Non-autoregressive transformer for speech recognition N Chen, S Watanabe, J Villalba, P Żelasko, N Dehak IEEE Signal Processing Letters 28, 121-125, 2020 | 125 | 2020 |
Mask CTC: Non-autoregressive end-to-end ASR with CTC and mask predict Y Higuchi, S Watanabe, N Chen, T Ogawa, T Kobayashi arXiv preprint arXiv:2005.08700, 2020 | 123 | 2020 |
Multi-task learning for text-dependent speaker verification N Chen, Y Qian, K Yu Proc. 16th Annual Conference of the International Speech Communication …, 2015 | 120 | 2015 |
State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18. J Villalba, N Chen, D Snyder, D Garcia-Romero, A McCree, G Sell, ... Interspeech, 1488-1492, 2019 | 117 | 2019 |
Robust deep feature for spoofing detection—The SJTU system for ASVspoof 2015 challenge N Chen, Y Qian, H Dinkel, B Chen, K Yu Sixteenth annual conference of the international speech communication …, 2015 | 101 | 2015 |
Overview of BTAS 2016 speaker anti-spoofing competition P Korshunov, S Marcel, H Muckenhirn, AR Gonçalves, AGS Mello, ... 2016 IEEE 8th international conference on biometrics theory, applications …, 2016 | 100 | 2016 |
Age estimation in short speech utterances based on LSTM recurrent neural networks R Zazo, PS Nidadavolu, N Chen, J Gonzalez-Rodriguez, N Dehak IEEE Access 6, 22524-22530, 2018 | 98 | 2018 |
Noise2Music: Text-conditioned music generation with diffusion models Q Huang, DS Park, T Wang, TI Denk, A Ly, N Chen, Z Zhang, Z Zhang, ... arXiv preprint arXiv:2302.03917, 2023 | 93 | 2023 |
End-to-end spoofing detection with raw waveform CLDNNs H Dinkel, N Chen, Y Qian, K Yu 2017 IEEE international conference on acoustics, speech and signal …, 2017 | 84 | 2017 |
Deep features for automatic spoofing detection Y Qian, N Chen, K Yu Speech Communication 85, 43-52, 2016 | 79 | 2016 |