Arnold, L. and Ollivier, Y. (2012). Layer-wise learning of deep generative models. ArXiv e-prints.
Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580.
Montavon, G. and Mu¨ller, K.-R. (2012). Learning feature hierarchies with cented deep Boltzmann machines. CoRR, abs/1203.4416.
Salakhutdinov, R. and Hinton, G. (2009). Deep Boltzmann machines. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS 2009), volume 8.
Stoyanov, V., Ropson, A., and Eisner, J. (2011). Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS), volume 15 of JMLR Workshop and Conference Proceedings, pages 725-733, Fort Lauderdale. Supplementary material (4 pages) also available.
Tieleman, T. (2008). Training restricted Boltzmann machines using approximations to the likelihood gradient. In W. W. Cohen, A. McCallum, and S. T. Roweis, editors, ICML 2008 , pages 1064-1071. ACM. [OpenAIRE]
Younes, L. (1999). On the convergence of markovian stochastic algorithms with rapidly decreasing ergodicity rates. Stochastics and Stochastic Reports, 65(3), 177-228.