Authors
David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis
Publication date
2016/1/28
Journal
Nature
Volume
529
Issue
7587
Pages
484-489
Publisher
Nature Publishing Group
Description
The game of Go has long been viewed as the most challenging of classic games for artificial
intelligence owing to its enormous search space and the difficulty of evaluating board
positions and moves. Here we introduce a new approach to computer Go that uses 'value
networks' to evaluate board positions and 'policy networks' to select moves. These deep
neural networks are trained by a novel combination of supervised learning from human
expert games, and reinforcement learning from games of self-play. Without any lookahead ...
Scholar articles
D Silver, A Huang, CJ Maddison, A Guez, L Sifre… - Nature, 2016