Publication · Preprint · 2018

Nonparametric Bayesian Deep Networks with Local Competition

Panousis, Konstantinos P.; Chatzis, Sotirios; Theodoridis, Sergios
Open Access · English
  • Published: 19 May 2018
Abstract
The aim of this work is to enable inference of deep networks that retain high accuracy for the least possible model complexity, with the latter deduced from the data during inference. To this end, we revisit deep networks that comprise competing linear units, as opposed to nonlinear units that do not entail any form of (local) competition. In this context, our main technical innovation consists in an inferential setup that leverages solid arguments from Bayesian nonparametrics. We infer both the needed set of connections or locally competing sets of units, as well as the required floating-point precision for storing the network parameters. Specifically, we intro...
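The truncated abstract already names the two ingredients the method composes: blocks of linear units competing via local winner-take-all (LWTA), gated by an Indian buffet process stick-breaking construction [23] that switches whole blocks on or off. The sketch below is a minimal illustration of that composition, not the authors' implementation; the function names, layer sizes, Beta-based sampler, and the hard argmax (which the cited relaxations [27, 28] would replace during training) are all assumptions made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking_gates(n_blocks, alpha=2.0):
    """IBP stick-breaking construction (Teh et al. [23]): v_j ~ Beta(alpha, 1),
    pi_k = prod_{j<=k} v_j, z_k ~ Bernoulli(pi_k). Because the pi_k decay,
    only finitely many blocks tend to remain active."""
    v = rng.beta(alpha, 1.0, size=n_blocks)
    pi = np.cumprod(v)
    return rng.binomial(1, pi)          # binary "block utility" indicators

def lwta_layer(x, W, b, block_size, z):
    """Linear units grouped into blocks of `block_size`; within each block only
    the unit with the largest pre-activation fires (hard winner-take-all),
    and the IBP gates z prune entire blocks of competing units."""
    a = x @ W + b                                   # (batch, n_units)
    a = a.reshape(x.shape[0], -1, block_size)       # (batch, n_blocks, block_size)
    winner_mask = (a == a.max(axis=-1, keepdims=True))
    out = a * winner_mask * z[None, :, None]        # losers and pruned blocks -> 0
    return out.reshape(x.shape[0], -1)

# Toy forward pass: 4 inputs -> 3 blocks of 2 competing linear units.
x = rng.normal(size=(5, 4))
W = rng.normal(size=(4, 6))
b = np.zeros(6)
z = stick_breaking_gates(n_blocks=3)
print(lwta_layer(x, W, b, block_size=2, z=z).shape, "active blocks:", z)
```

A training-time version would swap the hard winner selection and the Bernoulli draws for the continuous relaxations cited in [27, 28] so that gradients can flow, and would fit the Beta/Bernoulli posteriors variationally in the style of [24].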
Subjects
Free-text keywords: Computer Science - Machine Learning, Statistics - Machine Learning
References

[16] B. A. Olshausen and D. J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607-609, 1996.

[17] Wolfgang Maass. Neural computation with winner-take-all as the only nonlinear operation. In Proc. NIPS 12, pages 293-299. MIT Press, Cambridge, MA, 1999.

[18] W. Maass. On the computational power of winner-take-all. Neural Computation, 12(11):2519-2535, 2000.

[19] Rupesh K. Srivastava, Jonathan Masci, Sohrob Kazerounian, Faustino Gomez, and Jürgen Schmidhuber. Compete to compute. In Proc. NIPS 26, pages 2310-2318. Curran Associates, Inc., 2013.

[20] S. Grossberg. The ART of adaptive pattern recognition by a self-organizing neural network. Computer, 21(3):77-88, 1988.

[21] Michael McCloskey and Neal J. Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pages 109-165. Academic Press, 1989.

[22] Thomas L. Griffiths and Zoubin Ghahramani. Infinite latent feature models and the Indian buffet process. In Proc. NIPS 18, pages 475-482. MIT Press, 2005.

[23] Y. W. Teh, D. Görür, and Z. Ghahramani. Stick-breaking construction for the Indian buffet process. In Proc. AISTATS, volume 11, 2007.

[24] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. In Proc. ICLR, 2014.

[25] P. Kumaraswamy. A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46(1):79-88, 1980.

[26] Eric Nalisnick and Padhraic Smyth. Stick-breaking variational autoencoders. In Proc. ICLR, 2017.

[27] Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with Gumbel-softmax. In Proc. ICLR, 2017.

[28] Chris J. Maddison, Andriy Mnih, and Yee Whye Teh. The concrete distribution: A continuous relaxation of discrete random variables. In Proc. ICLR, 2017.

[29] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[30] Kirill Neklyudov, Dmitry Molchanov, Arsenii Ashukha, and Dmitry P. Vetrov. Structured Bayesian pruning via log-normal multiplicative noise. In Proc. NIPS 30, pages 6775-6784, 2017.
