Generic reinforcement learning codebase in TensorFlow

Authors: Li, Bryan; Cowen-Rivers, Alexander; Kozakowski, Piotr; Tao, David; Kamalakara, Siddhartha; Rajkumar, Nitarshan; Sezhiyan, Hariharan; +2 Authors

doi: 10.5281/zenodo.3382967, 10.5281/zenodo.3408453, 10.5281/zenodo.3382968

Abstract

Large reinforcement learning (RL) research groups, such as DeepMind and OpenAI, maintain internal (private) RL codebases that enable quick prototyping and comparison of ideas against many state-of-the-art (SOTA) methods. We argue that a sophisticated research codebase has five fundamental properties: modularity, reproducibility, many pre-implemented RL algorithms, speed, and ease of running on different hardware together with integration with visualization packages. To the authors' knowledge, no existing RL codebase has all five properties, particularly TensorBoard logging and abstracting cloud hardware such as TPUs away from the user. The codebase aims to help distil the best research practices into the community, lower the barrier to entry, and accelerate the pace of the field. More detailed documentation can be found here.

Keywords

TensorFlow, Reinforcement Learning

Impact by BIP!

	selected citations	0
	popularity	Average
	influence	Average
	impulse	Average

Usage by UsageCounts

	views	3