publication . Preprint . 2017

Ensemble Sampling

Lu, Xiuyuan; Van Roy, Benjamin
Open Access English
  • Published: 20 May 2017
Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems. In its basic form, the algorithm requires computing and sampling from a posterior distribution over models, which is tractable only for simple special cases. This paper develops ensemble sampling, which aims to approximate Thompson sampling while maintaining tractability even in the face of complex models such as neural networks. Ensemble sampling dramatically expands the range of applications for which Thompson sampling is viable. We establish a theoretical basis that supports the approach and present computational results that offer further insight.
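The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a linear bandit with a Gaussian prior and Gaussian reward noise, where each ensemble member is a regularized least-squares fit to independently perturbed rewards. At each step one member is drawn uniformly at random and used greedily, standing in for a posterior sample. The function name, parameters, and problem setup are hypothetical and chosen for illustration; they are not taken from this record.

```python
import numpy as np

def ensemble_sampling_linear_bandit(actions, theta_true, n_models=10, n_steps=200,
                                    noise_std=1.0, prior_std=1.0, seed=0):
    """Illustrative ensemble sampling on a linear-Gaussian bandit (assumed setup)."""
    rng = np.random.default_rng(seed)
    n_actions, dim = actions.shape

    # Each ensemble member starts from an independent draw from the assumed
    # N(0, prior_std^2 I) prior and is regularized toward that draw.
    prior_draws = rng.normal(0.0, prior_std, size=(n_models, dim))
    thetas = prior_draws.copy()

    # Sufficient statistics of the regularized least-squares fits. The precision
    # matrix is shared because every member sees the same actions.
    precision = np.eye(dim) / prior_std**2
    b = prior_draws / prior_std**2          # shape (n_models, dim)

    chosen = []
    for _ in range(n_steps):
        # Pick one member uniformly at random and act greedily under it.
        m = rng.integers(n_models)
        a_idx = int(np.argmax(actions @ thetas[m]))
        a = actions[a_idx]

        # Observe a noisy reward from the (simulated) environment.
        reward = a @ theta_true + rng.normal(0.0, noise_std)

        # Each member is updated with its own independently perturbed reward,
        # which keeps the ensemble diverse.
        perturbed = reward + rng.normal(0.0, noise_std, size=n_models)
        precision += np.outer(a, a) / noise_std**2
        b += np.outer(perturbed, a) / noise_std**2
        thetas = np.linalg.solve(precision, b.T).T

        chosen.append(a_idx)
    return np.array(chosen), thetas
```

A hypothetical call such as `ensemble_sampling_linear_bandit(np.eye(3), np.array([0.1, 0.2, 1.0]))` returns the sequence of chosen action indices and the final ensemble parameters. For models such as neural networks, the closed-form solve would presumably be replaced by incremental training of each member, which this sketch does not attempt.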
free text keywords: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Learning

