Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Accelerating Deep Neuroevolution on Distributed FPGAs for Reinforcement Learning Problems

Accelerating Deep Neuroevolution on Distributed FPGAs for Reinforcement Learning Problems Reinforcement learning, augmented by the representational power of deep neural networks, has shown promising results on high-dimensional problems, such as game playing and robotic control. However, the sequential nature of these problems poses a fundamental challenge for computational efficiency. Recently, alternative approaches such as evolutionary strategies and deep neuroevolution demonstrated competitive results with faster training time on distributed CPU cores. Here we report record training times (running at about 1 million frames per second) for Atari 2600 games using deep neuroevolution implemented on distributed FPGAs. Combined hardware implementation of the game console, image preprocessing and the neural network in an optimized pipeline, multiplied with the system level parallelism enabled the acceleration. These results are the first application demonstration on the IBM Neural Computer, which is a custom designed system that consists of 432 Xilinx FPGAs interconnected in a 3D mesh network topology. In addition to high performance, experiments also showed improvement in accuracy for all games compared to the CPU implementation of the same algorithm. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png ACM Journal on Emerging Technologies in Computing Systems (JETC) Association for Computing Machinery

Accelerating Deep Neuroevolution on Distributed FPGAs for Reinforcement Learning Problems

Loading next page...
 
/lp/association-for-computing-machinery/accelerating-deep-neuroevolution-on-distributed-fpgas-for-X005CnjA2R
Publisher
Association for Computing Machinery
Copyright
Copyright © 2021 ACM
ISSN
1550-4832
eISSN
1550-4840
DOI
10.1145/3425500
Publisher site
See Article on Publisher Site

Abstract

Reinforcement learning, augmented by the representational power of deep neural networks, has shown promising results on high-dimensional problems, such as game playing and robotic control. However, the sequential nature of these problems poses a fundamental challenge for computational efficiency. Recently, alternative approaches such as evolutionary strategies and deep neuroevolution demonstrated competitive results with faster training time on distributed CPU cores. Here we report record training times (running at about 1 million frames per second) for Atari 2600 games using deep neuroevolution implemented on distributed FPGAs. Combined hardware implementation of the game console, image preprocessing and the neural network in an optimized pipeline, multiplied with the system level parallelism enabled the acceleration. These results are the first application demonstration on the IBM Neural Computer, which is a custom designed system that consists of 432 Xilinx FPGAs interconnected in a 3D mesh network topology. In addition to high performance, experiments also showed improvement in accuracy for all games compared to the CPU implementation of the same algorithm.

Journal

ACM Journal on Emerging Technologies in Computing Systems (JETC)Association for Computing Machinery

Published: Apr 5, 2021

Keywords: Genetic algorithm

References