Authors
Quoc V Le, Navdeep Jaitly, Geoffrey E Hinton
Publication date
2015/4/3
Journal
arXiv preprint arXiv:1504.00941
Description
Abstract: Learning long-term dependencies in recurrent networks is difficult due to vanishing
and exploding gradients. To overcome this difficulty, researchers have developed
sophisticated optimization techniques and network architectures. In this paper, we propose a
simpler solution that uses recurrent neural networks composed of rectified linear units. Key to
our solution is the use of the identity matrix or its scaled version to initialize the recurrent
weight matrix. We find that our solution is comparable to LSTM on our four benchmarks: ...
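The abstract's core idea — initializing the recurrent weight matrix with the identity (or a scaled identity) in a ReLU RNN — can be sketched as follows. This is a minimal illustrative sketch, not the paper's code: the input-weight scale, hidden size, and function names are assumptions; only the identity recurrent initialization and the ReLU recurrence come from the abstract.

```python
import numpy as np

def init_irnn(hidden_size, input_size, scale=1.0, seed=0):
    """Initialize an 'IRNN'-style cell: recurrent weights set to a
    (scaled) identity matrix, per the abstract. Input weights and
    bias choices here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    W_hh = scale * np.eye(hidden_size)                      # recurrent weights: (scaled) identity
    W_xh = rng.normal(0.0, 0.001, (hidden_size, input_size))  # small random input weights (assumption)
    b = np.zeros(hidden_size)
    return W_hh, W_xh, b

def irnn_step(h, x, W_hh, W_xh, b):
    """One recurrent step with rectified linear units:
    h_next = relu(W_hh @ h + W_xh @ x + b)."""
    return np.maximum(0.0, W_hh @ h + W_xh @ x + b)
```

With identity recurrence and zero input, a non-negative hidden state passes through each step unchanged, which is the intuition for why this initialization helps gradients propagate over long time spans.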