Variational Inference and Deep Learning: A New Synthesis

Doctoral thesis (English, Open Access)
Kingma, D.P. (2017)
  • Subject: arXiv: Computer Science :: Machine Learning

In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
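
As orientation for the chapter outline below, the quantity at the center of the thesis is the evidence lower bound (ELBO) on the marginal likelihood of a latent-variable model. It is stated here in its standard form for reference (a well-known identity, not text quoted from the thesis):

```latex
% ELBO for a deep latent-variable model p_theta(x, z) with an
% inference model (encoder) q_phi(z|x); it lower-bounds log p_theta(x)
% and is maximized jointly over theta and phi.
\log p_\theta(\mathbf{x})
  \;\ge\;
  \mathcal{L}_{\theta,\phi}(\mathbf{x})
  \;=\;
  \mathbb{E}_{q_\phi(\mathbf{z} \mid \mathbf{x})}
    \!\left[ \log p_\theta(\mathbf{x}, \mathbf{z}) - \log q_\phi(\mathbf{z} \mid \mathbf{x}) \right]
```

The gap between log p_θ(x) and the ELBO equals the KL divergence from q_φ(z|x) to the true posterior p_θ(z|x), which is why maximizing the ELBO simultaneously fits the generative model and improves the approximate posterior.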

  • Table of Contents

    1 Introduction and Background
      1.1 Artificial Intelligence
      1.2 Probabilistic Models and Variational Inference
        1.2.1 Conditional Models
      1.3 Parameterizing Conditional Distributions with Neural Networks
      1.4 Directed Graphical Models and Neural Networks
      1.5 Learning in Fully Observed Models with Neural Nets
        1.5.1 Dataset
        1.5.2 Maximum Likelihood and Minibatch SGD
        1.5.3 Bayesian Inference
      1.6 Learning and Inference in Deep Latent Variable Models
        1.6.1 Latent Variables
        1.6.2 Deep Latent Variable Models
        1.6.3 Example DLVM for Multivariate Bernoulli Data
      1.7 Intractabilities
      1.8 Research Questions and Contributions

    2 Variational Autoencoders
      2.1 Introduction
      2.2 Encoder or Approximate Posterior
      2.3 Evidence Lower Bound (ELBO)
        2.3.1 A Double-Edged Sword
      2.4 Stochastic Gradient-Based Optimization of the ELBO
      2.5 Reparameterization Trick
        2.5.1 Change of Variables
        2.5.2 Gradient of Expectation under Change of Variable
        2.5.3 Gradient of ELBO
        2.5.4 Computation of log qφ(z|x)
      2.6 Factorized Gaussian Posteriors
        2.6.1 Full-Covariance Gaussian Posterior
      2.7 Estimation of the Marginal Likelihood
      2.8 Marginal Likelihood and ELBO as KL Divergences

    5 Inverse Autoregressive Flow
      5.1 Requirements for Computational Tractability
      5.2 Improving the Flexibility of Inference Models
        5.2.1 Auxiliary Latent Variables
        5.2.2 Normalizing Flows
      5.3 Inverse Autoregressive Transformations
      5.4 Inverse Autoregressive Flow (IAF)
      5.5 Related Work
      5.6 Experiments
        5.6.1 MNIST
        5.6.2 CIFAR-10
      5.7 Conclusion
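
Chapter 2 in the outline above introduces the ELBO together with the reparameterization trick and the factorized Gaussian posterior. The following is a minimal NumPy sketch combining those three ingredients, written for illustration only: the linear encoder/decoder, the weight names (W_mu, W_logvar, W_dec), and the toy dimensions are assumptions, not code or architecture from the thesis.

```python
# Illustrative sketch (not from the thesis): a single-sample Monte Carlo
# estimate of the ELBO for a toy model with a factorized Gaussian posterior
# q_phi(z|x) and a Bernoulli likelihood p_theta(x|z).
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, W_mu, W_logvar):
    # Placeholder linear encoder producing the parameters of
    # q_phi(z|x) = N(mu, diag(exp(logvar))).
    return x @ W_mu, x @ W_logvar

def decoder(z, W_dec):
    # Placeholder linear decoder producing Bernoulli means for p_theta(x|z).
    return 1.0 / (1.0 + np.exp(-(z @ W_dec)))

def elbo_estimate(x, W_mu, W_logvar, W_dec):
    mu, logvar = encoder(x, W_mu, W_logvar)
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    # so the randomness is decoupled from the parameters mu and logvar.
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    # Reconstruction term: log p_theta(x|z) for binary data.
    p = decoder(z, W_dec)
    log_px_given_z = np.sum(x * np.log(p + 1e-9) + (1 - x) * np.log(1 - p + 1e-9))
    # Analytic KL(q_phi(z|x) || N(0, I)) for a factorized Gaussian posterior.
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
    return log_px_given_z - kl  # single-sample ELBO estimate

# Toy usage: 4-dimensional binary observation, 2-dimensional latent code.
x = rng.integers(0, 2, size=4).astype(float)
W_mu, W_logvar = 0.1 * rng.normal(size=(4, 2)), 0.1 * rng.normal(size=(4, 2))
W_dec = 0.1 * rng.normal(size=(2, 4))
print("single-sample ELBO estimate:", elbo_estimate(x, W_mu, W_logvar, W_dec))
```

Because eps does not depend on the parameters, an automatic-differentiation framework could differentiate this estimate with respect to all of the weights, which is what makes the stochastic gradient-based optimization of the ELBO listed in Section 2.4 possible.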

