Title
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Authors
Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer
Publication date
2015
Conference
Advances in Neural Information Processing Systems
Pages
1171-1179
Description
Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that can accumulate quickly ...
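The abstract describes the train/inference mismatch of teacher forcing and motivates scheduled sampling: during training, feed the model its own previous prediction with some probability instead of the ground-truth token, and decay the teacher-forcing probability over time. A minimal sketch of that sampling decision, assuming an inverse-sigmoid decay schedule of the form epsilon_i = k / (k + exp(i / k)) (one of the schedules discussed in the paper; the function and parameter names here are illustrative, not from the paper's code):

```python
import math
import random

def choose_input(ground_truth_token, model_token, epsilon):
    """Scheduled sampling: with probability epsilon feed the ground-truth
    previous token (teacher forcing); otherwise feed the model's own
    prediction from the previous step."""
    return ground_truth_token if random.random() < epsilon else model_token

def inverse_sigmoid_decay(step, k=100.0):
    """Inverse-sigmoid decay of the teacher-forcing probability:
    epsilon_i = k / (k + exp(i / k)).
    Starts near 1 (mostly teacher forcing) and decays toward 0
    (mostly model samples) as training progresses."""
    return k / (k + math.exp(step / k))
```

At epsilon = 1 this reduces to standard teacher forcing; at epsilon = 0 the model is trained entirely on its own predictions, matching the inference-time regime.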