We introduce an Actor-Critic Ensemble (ACE) method for improving the performance of the Deep Deterministic Policy Gradient (DDPG) algorithm. At inference time, our method uses a critic ensemble to select the best action from the proposals of multiple actors running in parallel. B...
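The inference-time selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `select_action`, `actors`, and `critics` are hypothetical, the actors and critics are toy functions standing in for trained networks, and averaging the critics' scores is an assumption, since the text here does not say how the ensemble's estimates are combined.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]
Actor = Callable[[State], Action]          # state -> proposed action
Critic = Callable[[State, Action], float]  # (state, action) -> value estimate


def select_action(
    state: State, actors: List[Actor], critics: List[Critic]
) -> Tuple[Action, List[float]]:
    """Return the proposal with the highest mean critic score, plus all scores.

    Assumption: the ensemble score of an action is the plain average of the
    individual critics' estimates.
    """
    # Each actor proposes one action for the current state.
    proposals = [actor(state) for actor in actors]
    # Score every proposal with every critic and average per proposal.
    scores = [mean(critic(state, a) for critic in critics) for a in proposals]
    # Pick the index of the best-scoring proposal.
    best = max(range(len(proposals)), key=scores.__getitem__)
    return proposals[best], scores


# Toy stand-ins (hypothetical): three actors with fixed proposals and
# two critics that prefer actions near 0.4 and 0.6 respectively.
actors = [lambda s: [0.0], lambda s: [0.5], lambda s: [1.0]]
critics = [
    lambda s, a: -(a[0] - 0.4) ** 2,
    lambda s, a: -(a[0] - 0.6) ** 2,
]

best_action, scores = select_action([0.0], actors, critics)
print(best_action)
```

In this toy setup the middle proposal wins because it lies between the two critics' preferred actions. A real system would replace the lambdas with trained actor and critic networks and could evaluate the proposals in parallel.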