Authors
Wojciech Zaremba
Publication date
2015
Description
Abstract
The Recurrent Neural Network (RNN) is an extremely powerful sequence model
that is often difficult to train. The Long Short-Term Memory (LSTM) is a specific RNN
architecture whose design makes it much easier to train. While wildly successful in practice,
the LSTM's architecture appears to be ad-hoc so it is not clear if it is optimal, and the
significance of its individual components is unclear.
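The "individual components" the abstract refers to are the LSTM's gates and cell state. As a point of reference (not code from the paper), a single step of a standard LSTM cell can be sketched as follows; the weight layout and variable names here are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell (illustrative sketch).

    x: input vector of size D; h_prev, c_prev: previous hidden and
    cell states of size H. W has shape (4*H, D+H) and b shape (4*H,),
    stacking the input, forget, and output gates and the candidate
    update. This layout is a common convention, not the paper's.
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])        # input gate: how much new content to write
    f = sigmoid(z[H:2*H])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2*H:3*H])    # output gate: how much of the cell to expose
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # new cell state (additive memory path)
    h = o * np.tanh(c)         # new hidden state
    return h, c
```

The additive update of `c` is what makes the LSTM easier to train than a plain RNN: gradients flow through the cell state without repeated squashing. The paper's question is which of these gates actually matter.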