Authors
Wojciech Zaremba
Publication date
2015
Description
Abstract: The Recurrent Neural Network (RNN) is an extremely powerful sequence model
that is often difficult to train. The Long Short-Term Memory (LSTM) is a specific RNN
architecture whose design makes it much easier to train. While wildly successful in practice,
the LSTM's architecture appears to be ad hoc, so it is not clear whether it is optimal, and the
significance of its individual components is unclear.