
The Gibbs-Boltzmann distribution offers a physically interpretable way to massively reduce the dimensionality of high-dimensional probability distributions, where the extensive variables are `features' and the intensive variables are `descriptors'. However, not all probability distributions can be modeled using the Gibbs-Boltzmann form. Here, we present TMI ({\bf T}hermodynamic {\bf M}anifold {\bf I}nference), a thermodynamic approach to approximating a collection of arbitrary distributions. TMI simultaneously learns both intensive and extensive variables from data and achieves dimensionality reduction through a multiplicative, positive-valued, and interpretable decomposition of the data. Importantly, the reduced-dimensional space of intensive parameters is not homogeneous. The Gibbs-Boltzmann distribution defines an analytically tractable Riemannian metric on the space of intensive variables, allowing us to calculate geodesics and volume elements. We demonstrate TMI on multiple real and artificial data sets and discuss possible extensions.
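
To make the geometric claim concrete, the following is a minimal sketch, not the TMI algorithm itself. It assumes a finite state space, the sign convention p(x) ∝ exp(-Σ_k θ_k f_k(x)), and the standard exponential-family result that the Fisher information metric equals the covariance of the features. The names `features`, `theta`, `gibbs_boltzmann`, and `fisher_metric` are hypothetical and chosen only for illustration.

```python
import math

def gibbs_boltzmann(features, theta):
    """Probabilities p(x) proportional to exp(-sum_k theta[k] * features[x][k]).

    features: list of per-state feature vectors (the 'extensive' variables).
    theta:    list of 'intensive' parameters (assumed sign convention).
    """
    logits = [-sum(t * f for t, f in zip(theta, fx)) for fx in features]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - shift) for l in logits]
    z = sum(weights)     # partition function (up to the stability shift)
    return [w / z for w in weights]

def fisher_metric(features, theta):
    """Fisher information metric g[i][j] = Cov_p(f_i, f_j) at the point theta."""
    p = gibbs_boltzmann(features, theta)
    k = len(theta)
    mean = [sum(px * fx[i] for px, fx in zip(p, features)) for i in range(k)]
    return [
        [
            sum(px * (fx[i] - mean[i]) * (fx[j] - mean[j])
                for px, fx in zip(p, features))
            for j in range(k)
        ]
        for i in range(k)
    ]

# Toy example (made-up numbers): four states, two features per state.
features = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
theta = [0.3, -0.2]

p = gibbs_boltzmann(features, theta)
g = fisher_metric(features, theta)

# Riemannian volume element sqrt(det g) for the 2x2 metric.
volume_element = math.sqrt(g[0][0] * g[1][1] - g[0][1] * g[1][0])
```

Because the metric depends on `theta`, the volume element varies from point to point in the space of intensive variables, which is one way to read the statement that this space is not homogeneous.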
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistical Mechanics (cond-mat.stat-mech), Physics, QC1-999, FOS: Physical sciences, Quantitative Biology - Quantitative Methods, Machine Learning (cs.LG), FOS: Biological sciences, Condensed Matter - Statistical Mechanics, Quantitative Methods (q-bio.QM)
| selected citations | 8 |
| popularity | Top 10% |
| influence | Average |
| impulse | Top 10% |
