Joint Training of Deep Boltzmann Machines

Preprint (English, Open Access)
Goodfellow, Ian; Courville, Aaron; Bengio, Yoshua;
(2012)
  • Subject: Statistics - Machine Learning | Computer Science - Learning
    ACM: Computing Methodologies - Pattern Recognition

We introduce a new method for jointly training deep Boltzmann machines. Prior methods either require an initial learning pass that trains the deep Boltzmann machine greedily, one layer at a time, or do not perform well on classification tasks.
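The contrast the abstract draws can be illustrated with a toy sketch. In a two-hidden-layer DBM, inference over both hidden layers is coupled: the first hidden layer receives input from the visible units *and* top-down input from the second hidden layer, so a joint (mean-field) fixed-point update over all layers replaces the greedy, one-layer-at-a-time pass. The code below is a minimal, hypothetical illustration of that joint inference step, not the paper's actual algorithm; layer sizes, the number of mean-field steps, and the simplified positive-phase update are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy two-hidden-layer DBM (sizes are illustrative, not from the paper).
nv, nh1, nh2 = 6, 4, 3
W1 = rng.normal(0.0, 0.01, (nv, nh1))   # visible-to-h1 weights
W2 = rng.normal(0.0, 0.01, (nh1, nh2))  # h1-to-h2 weights

def mean_field(v, n_steps=10):
    """Joint mean-field inference over both hidden layers.

    Unlike greedy layer-wise inference, h1's update uses both the
    visible input and top-down input from h2, and the two layers are
    iterated together toward a fixed point.
    """
    h1 = np.full(nh1, 0.5)
    h2 = np.full(nh2, 0.5)
    for _ in range(n_steps):
        h1 = sigmoid(v @ W1 + h2 @ W2.T)  # bottom-up AND top-down input
        h2 = sigmoid(h1 @ W2)             # input from h1 only (top layer)
    return h1, h2

v = rng.integers(0, 2, nv).astype(float)
h1, h2 = mean_field(v)

# One joint gradient step on both weight matrices at once (only the
# data-dependent "positive phase" is shown; a real learner subtracts a
# model-phase term, e.g. estimated with persistent Markov chains).
lr = 0.1
W1 += lr * np.outer(v, h1)
W2 += lr * np.outer(h1, h2)
```

The key point of the sketch is that `W1` and `W2` are updated in the same step from jointly inferred states, whereas a greedy procedure would fit `W1` to convergence as an RBM before `W2` is ever touched.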
  • References (7)

    Arnold, L. and Ollivier, Y. (2012). Layer-wise learning of deep generative models. ArXiv e-prints.

    Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580.

    Montavon, G. and Müller, K.-R. (2012). Learning feature hierarchies with centered deep Boltzmann machines. CoRR, abs/1203.4416.

    Salakhutdinov, R. and Hinton, G. (2009). Deep Boltzmann machines. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS 2009), volume 8.

    Stoyanov, V., Ropson, A., and Eisner, J. (2011). Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS), volume 15 of JMLR Workshop and Conference Proceedings, pages 725-733, Fort Lauderdale. Supplementary material (4 pages) also available.

    Tieleman, T. (2008). Training restricted Boltzmann machines using approximations to the likelihood gradient. In W. W. Cohen, A. McCallum, and S. T. Roweis, editors, ICML 2008, pages 1064-1071. ACM.

    Younes, L. (1999). On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates. Stochastics and Stochastic Reports, 65(3), 177-228.
