Authors
William Chan, Navdeep Jaitly, Quoc V Le, Oriol Vinyals
Publication date
2015
Conference
arXiv
Description
Abstract: We present Listen, Attend and Spell (LAS), a neural network that learns to
transcribe speech utterances to characters. Unlike traditional DNN-HMM models, this model
learns all the components of a speech recognizer jointly. Our system has two components: a
listener and a speller. The listener is a pyramidal recurrent network encoder that accepts
filter bank spectra as inputs. The speller is an attention-based recurrent network decoder
that emits characters as outputs. The network produces character sequences without ...
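A minimal sketch of the two ideas named in the abstract, under stated assumptions: "pyramidal" is taken to mean that each listener layer halves the time resolution by concatenating adjacent frames, and the speller's attention is simplified to plain dot-product attention. The recurrent layers and all learned parameters are omitted, and the function names `pyramid_reduce` and `attend` are hypothetical, not taken from the paper.

```python
import math

def pyramid_reduce(frames):
    """Halve the time resolution by concatenating each pair of adjacent frames."""
    usable = len(frames) - len(frames) % 2  # drop a trailing odd frame
    return [frames[t] + frames[t + 1] for t in range(0, usable, 2)]

def attend(query, states):
    """Return softmax attention weights and the weighted-sum context vector."""
    # Assumption: dot-product scoring stands in for the model's learned scorer.
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * state[i] for w, state in zip(weights, states))
               for i in range(len(states[0]))]
    return weights, context

# Toy input: 8 frames of 2 filter bank features each (made-up values).
frames = [[float(t), float(t) / 2.0] for t in range(8)]
listener_out = pyramid_reduce(pyramid_reduce(frames))  # 8 -> 4 -> 2 time steps
weights, context = attend([0.1] * 8, listener_out)     # hypothetical speller state
```

Two reductions shorten the sequence by a factor of four, so the speller has fewer listener states to attend over at each output character.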
Scholar articles
W Chan, N Jaitly, QV Le, O Vinyals - arXiv preprint arXiv:1508.01211, 2015