Publication · Preprint · 2016

Information Theoretic-Learning Auto-Encoder

Santana, Eder; Emigh, Matthew; Principe, Jose C.
Open Access · English
  • Published: 21 Mar 2016
Abstract
We propose Information Theoretic-Learning (ITL) divergence measures for the variational regularization of neural networks. We also explore ITL-regularized autoencoders as an alternative to variational autoencoding Bayes, adversarial autoencoders, and generative adversarial networks for generating random samples without explicitly defining a partition function. This paper also formalizes generative moment matching networks under the ITL framework.
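
The abstract's central move can be made concrete. ITL's Euclidean divergence between two densities, D_ED(p, q) = ∫ (p(x) - q(x))² dx, estimated with Gaussian Parzen windows, reduces to a biased sample estimate of squared maximum mean discrepancy (MMD), the statistic behind generative moment matching networks. As a rough illustration only (a minimal NumPy sketch, not the authors' code; the names gaussian_gram, mmd2, and the fixed bandwidth sigma are ours), the following computes that estimate between a batch of latent codes and samples from a target prior:

import numpy as np

# Sketch of a kernel divergence of the kind the paper builds on: a biased
# estimate of squared MMD, which with a Gaussian Parzen window coincides
# with ITL's Euclidean divergence between the two sample densities.

def gaussian_gram(x, y, sigma=1.0):
    # K[i, j] = exp(-||x_i - y_j||^2 / (2 * sigma^2))
    sq = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2.0 * x @ y.T
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased V-statistic: mean K(x,x) - 2 * mean K(x,y) + mean K(y,y)
    return (gaussian_gram(x, x, sigma).mean()
            - 2.0 * gaussian_gram(x, y, sigma).mean()
            + gaussian_gram(y, y, sigma).mean())

rng = np.random.default_rng(0)
codes = rng.normal(loc=2.0, size=(256, 8))  # stand-in for encoder outputs
prior = rng.normal(loc=0.0, size=(256, 8))  # samples from the prior N(0, I)
print("MMD^2, codes vs prior:", mmd2(codes, prior))                      # large
print("MMD^2, prior vs prior:", mmd2(prior, rng.normal(size=(256, 8))))  # near 0

Used as a training penalty on minibatches of latent codes, a term of this form would play the role that the KL divergence plays in a variational autoencoder; because it is built entirely from pairwise kernel evaluations, no density model or partition function ever has to be written down.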
Subjects
arXiv: Computer Science::Machine Learning; Computer Science::Neural and Evolutionary Computation; Statistics::Machine Learning
ACM Computing Classification System: Data / Coding and Information Theory
Free-text keywords: Computer Science - Learning
References

[1] Ian Goodfellow, Aaron Courville, and Yoshua Bengio, “Deep learning,” book in preparation for MIT Press, 2015.

[2] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.

[3] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.

[4] Diederik P Kingma and Max Welling, “Auto-encoding variational Bayes,” arXiv preprint arXiv:1312.6114, 2013.

[5] Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra, “Stochastic backpropagation and approximate inference in deep generative models,” arXiv preprint arXiv:1401.4082, 2014.

[6] Jose C Principe, Information Theoretic Learning: Rényi's Entropy and Kernel Perspectives, Springer Science & Business Media, 2010.

[7] W Keith Hastings, “Monte Carlo sampling methods using Markov chains and their applications,” Biometrika, vol. 57, no. 1, pp. 97-109, 1970.

[8] Tejas D Kulkarni, Will Whitney, Pushmeet Kohli, and Joshua B Tenenbaum, “Deep convolutional inverse graphics network,” arXiv preprint arXiv:1503.03167, 2015.

[9] Emily L Denton, Soumith Chintala, Rob Fergus, et al., “Deep generative image models using a Laplacian pyramid of adversarial networks,” in Advances in Neural Information Processing Systems, 2015, pp. 1486-1494.

[10] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672-2680.

[11] Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, and Ian Goodfellow, “Adversarial autoencoders,” arXiv preprint arXiv:1511.05644, 2015.

[12] Jose C Principe, Dongxin Xu, and John Fisher, “Information theoretic learning,” Unsupervised adaptive filtering, vol. 1, pp. 265-319, 2000.

[13] Alfréd Rényi, “On measures of entropy and information,” in Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1961, vol. 1, pp. 547-561.

[14] Bernard W Silverman, Density Estimation for Statistics and Data Analysis, vol. 26, CRC Press, 1986.

[15] Arthur Gretton, Karsten M Borgwardt, Malte Rasch, Bernhard Schölkopf, and Alex J Smola, “A kernel method for the two-sample-problem,” in Advances in Neural Information Processing Systems, 2006, pp. 513-520.
