Anchored Bayesian Gaussian mixture models

Publication type: Article, Preprint, Other literature type

Date: 01 Jan 2020

Embargo end date: 01 Jan 2018

Publisher: Institute of Mathematical Statistics

Journal: Electronic Journal of Statistics, volume 14 (ISSN: 1935-7524)

Funded by: NSF | Bayesian Empirical Likeli..., NSF | New Methods for the Analy...

Authors: Kunkel, Deborah; Peruggia, Mario

doi: 10.1214/20-ejs1756, 10.48550/arxiv.1805.08304

arXiv: 1805.08304

Abstract

Finite mixtures are a flexible modeling tool for irregularly shaped densities and samples from heterogeneous populations. When modeling with mixtures using an exchangeable prior on the component features, the component labels are arbitrary and indistinguishable in posterior analysis. This makes it impossible to attribute any meaningful interpretation to the marginal posterior distributions of the component features. We propose a model in which a small number of observations are assumed to arise from some of the labeled component densities. The resulting model is not exchangeable, allowing inference on the component features without post-processing. Our method assigns meaning to the component labels at the modeling stage and can be justified as a data-dependent informative prior on the labelings. We show that our method produces interpretable results, often (but not always) similar to those resulting from relabeling algorithms, with the added benefit that the marginal inferences originate directly from a well-specified probability model rather than a post hoc manipulation. We provide asymptotic results leading to practical guidelines for model selection that are motivated by maximizing prior information about the class labels, and we demonstrate our method on real and simulated data.
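The label-switching issue described in the abstract can be illustrated with a small numerical sketch. This is not the authors' code. It assumes a two-component univariate Gaussian mixture, made-up data, and a hypothetical `anchors` mapping from observation index to component label, and it assumes that an anchored observation contributes only the density of its assigned component. Under those assumptions, the ordinary mixture likelihood is unchanged when the component labels are swapped, whereas the anchored likelihood is not.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def mixture_loglik(data, weights, mus, sigmas, anchors=None):
    """Log-likelihood of a Gaussian mixture.

    `anchors` (hypothetical name) maps an observation index to the component
    it is assumed to come from. Anchored observations contribute only that
    component's density (an assumption of this sketch); all others
    contribute the full mixture density.
    """
    anchors = anchors or {}
    total = 0.0
    for i, x in enumerate(data):
        if i in anchors:
            k = anchors[i]
            total += normal_logpdf(x, mus[k], sigmas[k])
        else:
            total += math.log(
                sum(w * math.exp(normal_logpdf(x, m, s))
                    for w, m, s in zip(weights, mus, sigmas))
            )
    return total


# Made-up data: three points near -2 and three points near +2.
data = [-2.1, -1.9, -2.3, 1.8, 2.2, 2.0]
weights, mus, sigmas = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]

# The same parameter values with the two component labels swapped.
sw_weights, sw_mus, sw_sigmas = weights[::-1], mus[::-1], sigmas[::-1]

# Without anchors, the likelihood cannot tell the two labelings apart.
plain = mixture_loglik(data, weights, mus, sigmas)
plain_swapped = mixture_loglik(data, sw_weights, sw_mus, sw_sigmas)

# Anchor observation 0 to component 0 and observation 3 to component 1.
anchors = {0: 0, 3: 1}
anchored = mixture_loglik(data, weights, mus, sigmas, anchors)
anchored_swapped = mixture_loglik(data, sw_weights, sw_mus, sw_sigmas, anchors)

print("unanchored:", plain, plain_swapped)
print("anchored:  ", anchored, anchored_swapped)
```

In this toy setting the two unanchored values coincide, while the anchored likelihood favors the labeling in which component 0 sits near the anchored point at -2.1. That asymmetry is what lets the labels carry meaning at the modeling stage rather than through post-processing.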

65 pages, 11 figures, 11 tables

Related Organizations

The Ohio State University, United States
Clemson University, United States
The Ohio State University at Marion, United States

Keywords

Methodology (stat.ME), FOS: Computer and information sciences, Classification and discrimination; cluster analysis (statistical aspects), Statistical astronomy, Label switching, Applications of statistics to physics, identifiability, EM algorithm, data-dependent prior

Impact by BIP!

- Selected citations: 2
- Popularity: Average
- Influence: Average
- Impulse: Average

Funded by

NSF| Bayesian Empirical Likelihood: Data Analysis Tools with Applications in Econometrics, NSF| New Methods for the Analysis of Human Performance Data