Bayesian Criterion-Based Variable Selection

Publication: Article, 01 Aug 2021, English. Publisher: Oxford University Press (OUP). Journal: Journal of the Royal Statistical Society Series C: Applied Statistics, volume 70, pages 835-857 (issn: 0035-9254, eissn: 1467-9876)

Authors: Maity, Arnab Kumar; Basu, Sanjib; Ghosh, Santu;

doi: 10.1111/rssc.12488

pmid: 38863987

pmc: PMC11166016


Abstract

Bayesian approaches for criterion-based selection include the marginal-likelihood-based highest posterior model (HPM) and the deviance information criterion (DIC). The DIC is popular in practice because it can often be estimated from sampling-based methods with relative ease and is readily available in various Bayesian software. We find that the sensitivity of DIC-based selection can be high, in the range of 90–100%. However, correct selection by DIC can be in the range of 0–2%. These performances persist consistently as the sample size increases. We establish that both marginal likelihood and DIC asymptotically disfavour under-fitted models, explaining the high sensitivities of both criteria. However, the mis-selection probability of DIC remains bounded below by a positive constant in linear models with g-priors, whereas the mis-selection probability by marginal likelihood converges to 0 under certain conditions. A consequence of our results is that not only can the DIC not asymptotically differentiate between the data-generating model and an over-fitted model, but it also cannot asymptotically differentiate between two over-fitted models. We illustrate these results in multiple simulation studies and in a biomarker selection problem on cancer cachexia of non-small cell lung cancer patients. We further study the performances of HPM and DIC in generalized linear models, as practitioners often choose to use DIC, which is readily available in software, in such non-conjugate settings.
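
The contrast between the two criteria can be illustrated with a small simulation. The following Python sketch is not the authors' code or simulation design. It rests on simplifying assumptions that do not come from the abstract: a Gaussian linear model without intercept, a known error variance `sigma2`, a Zellner g-prior on the coefficients with `g = n`, and a comparison restricted to the data-generating model versus one over-fitted model with a single spurious covariate. Under these assumptions both the marginal likelihood and the DIC have closed forms, so no posterior sampling is needed. The function names (`fit_summaries`, `log_marginal`, `dic`, `overfit_rates`) are hypothetical.

```python
import numpy as np

def fit_summaries(y, X):
    """Least-squares summaries: residual SS, fitted SS, number of columns."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta_hat
    rss = float(np.sum((y - fitted) ** 2))
    ssf = float(np.sum(fitted ** 2))
    return rss, ssf, X.shape[1]

def log_marginal(y, X, g, sigma2=1.0):
    """Log marginal likelihood for beta ~ N(0, g*sigma2*(X'X)^-1), sigma2 known.

    Marginally y ~ N(0, sigma2*(I + g*P_X)), where P_X projects onto col(X).
    """
    n = len(y)
    _, ssf, p = fit_summaries(y, X)
    quad = float(y @ y) - g / (1.0 + g) * ssf
    return (-0.5 * n * np.log(2 * np.pi * sigma2)
            - 0.5 * p * np.log1p(g)
            - 0.5 * quad / sigma2)

def dic(y, X, g, sigma2=1.0):
    """DIC = D(posterior mean) + 2*p_D under the same assumed prior.

    The posterior mean is shrink*beta_hat with shrink = g/(1+g), and the
    effective number of parameters works out to p_D = p*shrink.
    """
    n = len(y)
    rss, ssf, p = fit_summaries(y, X)
    shrink = g / (1.0 + g)
    rss_at_mean = rss + (1.0 - shrink) ** 2 * ssf
    dev_at_mean = rss_at_mean / sigma2 + n * np.log(2 * np.pi * sigma2)
    return dev_at_mean + 2.0 * p * shrink

def overfit_rates(n, reps=1000, seed=0):
    """Fraction of replicates in which each criterion prefers the over-fitted model."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -1.0])   # assumed true coefficients
    g = float(n)                   # assumed choice g = n
    dic_wrong = 0
    ml_wrong = 0
    for _ in range(reps):
        X = rng.standard_normal((n, 3))   # third column is spurious
        y = X[:, :2] @ beta + rng.standard_normal(n)
        X_true, X_over = X[:, :2], X
        dic_wrong += int(dic(y, X_over, g) < dic(y, X_true, g))
        ml_wrong += int(log_marginal(y, X_over, g) > log_marginal(y, X_true, g))
    return dic_wrong / reps, ml_wrong / reps

if __name__ == "__main__":
    for n in (100, 1000):
        print(n, overfit_rates(n, reps=500, seed=n))
```

In this simplified setting, the DIC difference between the two models behaves roughly like 2 minus a chi-squared variable with one degree of freedom, so the rate at which DIC prefers the over-fitted model should stay near P(χ²₁ > 2) ≈ 0.16 however large `n` becomes. The marginal likelihood carries a penalty of order log(1 + g), so its rate should shrink as `n` grows. This mirrors the qualitative statement in the abstract; it does not reproduce the paper's conditions or results.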

Related Organizations

Georgia Regents University, United States
University of Illinois at Chicago, United States
Pfizer (United States), United States
Augusta University, United States

Keywords

\(g\)-prior, highest posterior model, Applications of statistics, deviance information criterion, mis-selection

Impact by BIP!

selected citations: 12
popularity: Top 10%
influence: Average
impulse: Top 10%