Publication · Preprint · 2015

Soft-Deep Boltzmann Machines

Kiwaki, Taichi
Open Access · English
Published: 10 May 2015
Abstract
We present a layered Boltzmann machine (BM) that can better exploit the advantages of a distributed representation. It is widely believed that deep BMs (DBMs) have far greater representational power than their shallow counterparts, restricted Boltzmann machines (RBMs). However, this supposed superiority of DBMs over RBMs has never been validated theoretically. In this paper, we provide both theoretical and empirical evidence that the representational power of DBMs can in fact be rather limited in exploiting distributed representations. We propose an approximate measure of the representational power of a BM with respect to the efficiency…
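
As background for the abstract (these are the standard textbook definitions, not notation taken from the paper, whose full text is truncated here): a BM assigns an energy to each binary configuration of visible units v and hidden units h and turns it into a Gibbs distribution; an RBM restricts connections to a visible–hidden bipartite graph, and a DBM stacks hidden layers with couplings only between adjacent layers. A minimal sketch, with weight matrices W, W^1, W^2 and bias vectors b, c assumed for illustration:

\[
E_{\text{RBM}}(\mathbf{v},\mathbf{h}) = -\mathbf{v}^{\top} W \mathbf{h} - \mathbf{b}^{\top}\mathbf{v} - \mathbf{c}^{\top}\mathbf{h},
\qquad
p(\mathbf{v},\mathbf{h}) = \frac{e^{-E(\mathbf{v},\mathbf{h})}}{Z},
\quad
Z = \sum_{\mathbf{v},\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h})}.
\]

% A two-hidden-layer DBM couples only adjacent layers (bias terms omitted):
\[
E_{\text{DBM}}(\mathbf{v},\mathbf{h}^{1},\mathbf{h}^{2}) = -\mathbf{v}^{\top} W^{1} \mathbf{h}^{1} - (\mathbf{h}^{1})^{\top} W^{2} \mathbf{h}^{2}.
\]

The representational-power question raised in the abstract concerns how efficiently the marginal p(v) of such models can realize distributions that benefit from a distributed code.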
Subjects
Free-text keywords: Computer Science - Neural and Evolutionary Computing, Computer Science - Learning, Statistics - Machine Learning