Human-level control through deep reinforcement learning

V Mnih, K Kavukcuoglu, D Silver, AA Rusu, J Veness… - Nature, 2015 - nature.com
The theory of reinforcement learning provides a normative account, deeply rooted in
psychological and neuroscientific perspectives on animal behaviour, of how agents may
optimize their control of an environment. To use reinforcement learning successfully in ...

The arcade learning environment: An evaluation platform for general agents

MG Bellemare, Y Naddaf, J Veness… - Journal of Artificial …, 2012 - arxiv.org
Abstract: In this article we introduce the Arcade Learning Environment (ALE): both a
challenge problem and a platform and methodology for evaluating the development of
general, domain-independent AI technology. ALE provides an interface to hundreds of ...

Playing Atari with deep reinforcement learning

V Mnih, K Kavukcuoglu, D Silver, A Graves… - arXiv preprint arXiv: …, 2013 - arxiv.org
Abstract: We present the first deep learning model to successfully learn control policies
directly from high-dimensional sensory input using reinforcement learning. The model is a
convolutional neural network, trained with a variant of Q-learning, whose input is raw ...

Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning

X Guo, S Singh, H Lee, RL Lewis… - Advances in neural …, 2014 - papers.nips.cc
Abstract: The combination of modern Reinforcement Learning and Deep Learning
approaches holds the promise of making significant progress on challenging applications
requiring both rich perception and policy-selection. The Arcade Learning Environment ( ...

End-to-end training of deep visuomotor policies

S Levine, C Finn, T Darrell, P Abbeel - Journal of Machine Learning …, 2016 - jmlr.org
Abstract: Policy search methods can allow robots to learn control policies for a wide range of
tasks, but practical applications of policy search often require hand-engineered components
for perception, state estimation, and low-level control. In this paper, we aim to answer the ...

Mastering the game of Go with deep neural networks and tree search

D Silver, A Huang, CJ Maddison, A Guez, L Sifre… - Nature, 2016 - nature.com
The game of Go has long been viewed as the most challenging of classic games for artificial
intelligence owing to its enormous search space and the difficulty of evaluating board
positions and moves. Here we introduce a new approach to computer Go that uses 'value ...

Deep reinforcement learning with double Q-learning

H Van Hasselt, A Guez, D Silver - CoRR, abs/1509.06461, 2015 - aaai.org
Abstract: The popular Q-learning algorithm is known to overestimate action values under
certain conditions. It was not previously known whether, in practice, such overestimations
are common, whether they harm performance, and whether they can generally be ...

Continuous control with deep reinforcement learning

TP Lillicrap, JJ Hunt, A Pritzel, N Heess, T Erez… - arXiv preprint arXiv: …, 2015 - arxiv.org
Abstract: We adapt the ideas underlying the success of Deep Q-Learning to the continuous
action domain. We present an actor-critic, model-free algorithm based on the deterministic
policy gradient that can operate over continuous action spaces. Using the same learning ...

Prioritized experience replay

T Schaul, J Quan, I Antonoglou, D Silver - arXiv preprint arXiv:1511.05952, 2015 - arxiv.org
Abstract: Experience replay lets online reinforcement learning agents remember and reuse
experiences from the past. In prior work, experience transitions were uniformly sampled from
a replay memory. However, this approach simply replays transitions at the same ...

Deep learning

Y LeCun, Y Bengio, G Hinton - Nature, 2015 - nature.com
Deep learning allows computational models that are composed of multiple processing
layers to learn representations of data with multiple levels of abstraction. These methods
have dramatically improved the state-of-the-art in speech recognition, visual object ...