publication . Preprint . 2014

A Bayesian encourages dropout

Maeda, Shin-ichi
Open Access English
  • Published: 22 Dec 2014
Abstract
Dropout is one of the key techniques for preventing overfitting during learning. Previous work has explained dropout as a kind of modified L2 regularization. Here, we examine dropout from a Bayesian standpoint. The Bayesian interpretation enables us to optimize the dropout rate, which benefits both the learning of the weight parameters and prediction after learning. The experimental results also support optimizing the dropout rate.
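As a rough illustration of the idea of prediction as an average over dropout masks, the following Python sketch compares a Monte Carlo average over randomly sampled masks with the usual weight-scaling rule for a single linear unit. It is not the paper's algorithm: the function names, the toy weights and inputs, and the fixed keep probability `keep_prob` are all assumptions made for this example, and the paper's proposal to optimize the dropout rate is not implemented here.

```python
import random


def dropout_mask(n, keep_prob, rng):
    """Sample a Bernoulli mask: each input is kept with probability keep_prob."""
    return [1.0 if rng.random() < keep_prob else 0.0 for _ in range(n)]


def linear_output(weights, inputs, mask):
    """Output of a single linear unit whose inputs are multiplied by the mask."""
    return sum(w * x * m for w, x, m in zip(weights, inputs, mask))


def mc_average_prediction(weights, inputs, keep_prob, n_samples, seed=0):
    """Average the unit's output over n_samples randomly drawn masks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        mask = dropout_mask(len(inputs), keep_prob, rng)
        total += linear_output(weights, inputs, mask)
    return total / n_samples


def scaled_prediction(weights, inputs, keep_prob):
    """Weight-scaling rule: keep every input and scale the output by keep_prob."""
    return keep_prob * sum(w * x for w, x in zip(weights, inputs))


# Toy values, chosen only for illustration.
weights = [0.5, -1.0, 2.0, 0.25]
inputs = [1.0, 2.0, 0.5, 4.0]
keep_prob = 0.8  # hypothetical fixed value; the paper argues for optimizing it

mc = mc_average_prediction(weights, inputs, keep_prob, n_samples=20000)
exact = scaled_prediction(weights, inputs, keep_prob)
print(f"Monte Carlo average over masks: {mc:.4f}")
print(f"Weight-scaling prediction:      {exact:.4f}")
```

For a linear unit the two predictions agree in expectation, so the Monte Carlo estimate should approach the scaled prediction as the number of sampled masks grows. For nonlinear networks the scaling rule is only an approximation to the average over masks.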
Subjects
free text keywords: Computer Science - Learning, Computer Science - Neural and Evolutionary Computing, Statistics - Machine Learning

