Authors
William Chan, Navdeep Jaitly, Quoc V Le, Oriol Vinyals
Publication date
2016
Conference
International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Description
ABSTRACT We present Listen, Attend and Spell (LAS), a neural speech recognizer that
transcribes speech utterances directly to characters without pronunciation models, HMMs or
other components of traditional speech recognizers. In LAS, the neural network architecture
subsumes the acoustic, pronunciation and language models, making it not only an end-to-end
trained system but an end-to-end model. In contrast to DNN-HMM, CTC and most other
models, LAS makes no independence assumptions about the probability distribution of ...
Total citations
20161
Scholar articles
W Chan, N Jaitly, Q Le, O Vinyals - 2016 IEEE International Conference on Acoustics, …, 2016