Comparing the Accuracy of Three Predictive Information Criteria for Bayesian Linear Multilevel Model Selection

Due in part to recent advances in software, such as Stan and the popular brms package in R, Bayesian multilevel modeling techniques have become increasingly popular. As researchers leverage these techniques to fit new models, information criteria (fit indices that summarize a model’s fit to the data) play an important role in discriminating between competing models. However, a systematic evaluation of these Bayesian criteria in a multilevel context has not yet been undertaken. Here, using simulation, we investigate the model selection accuracy of three popular information criteria: the deviance information criterion (DIC), the Watanabe-Akaike information criterion (WAIC), and the leave-one-out cross-validation information criterion (LOO-CV). We manipulated the following factors to determine how they affected these indices’ accuracy in identifying the data-generating model: 1) the number of groups, 2) the number of observations, 3) the variance of the random effect, 4) the magnitude of the fixed effects, 5) model misspecification, and 6) the model selection strategy. In general, WAIC and LOO-CV outperformed DIC and are recommended when computationally feasible. In addition, we argue that a selection strategy that simply chooses the model with the lowest information criterion value, a practice frequently employed in applied research, may result in overfitting. Recommendations for best practices in Bayesian multilevel model selection are provided.
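
To make the criteria named above concrete, the following is a minimal sketch of how WAIC can be computed from a matrix of pointwise log-likelihood values, using the standard decomposition into a log pointwise predictive density minus a variance-based penalty. It is an illustration under stated assumptions, not the procedure used in the study: the function names (`pointwise_elpd_waic`, `waic`, `prefer_complex`), the nested-list input layout, and the `threshold` parameter are hypothetical. The final function shows one possible alternative to picking the lowest value outright, namely requiring the gain of the more complex model to exceed a multiple of its standard error; the abstract does not specify which strategies were compared, so this rule is an assumption.

```python
import math
import statistics


def pointwise_elpd_waic(log_lik):
    """Return the WAIC-based expected log predictive density per observation.

    log_lik is a list of posterior draws; each draw is a list holding the
    log-likelihood of every observation under that draw. At least two draws
    are needed so that the variance is defined.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    elpd = []
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Log pointwise predictive density, via a numerically stable log-mean-exp.
        peak = max(column)
        lppd_i = peak + math.log(sum(math.exp(v - peak) for v in column) / n_draws)
        # Penalty term: posterior sample variance of the log-likelihood.
        p_waic_i = statistics.variance(column)
        elpd.append(lppd_i - p_waic_i)
    return elpd


def waic(log_lik):
    """WAIC on the deviance scale (lower values indicate better expected fit)."""
    return -2.0 * sum(pointwise_elpd_waic(log_lik))


def prefer_complex(log_lik_simple, log_lik_complex, threshold=2.0):
    """Hypothetical rule: choose the complex model only on a clear elpd gain.

    Both inputs must cover the same observations in the same order, and there
    must be at least two observations. The threshold is an assumed value.
    """
    simple = pointwise_elpd_waic(log_lik_simple)
    complex_ = pointwise_elpd_waic(log_lik_complex)
    diffs = [c - s for c, s in zip(complex_, simple)]
    elpd_diff = sum(diffs)
    # Standard error of the summed pointwise differences.
    se_diff = math.sqrt(len(diffs) * statistics.variance(diffs))
    return elpd_diff > threshold * se_diff
```

In practice the log-likelihood matrix would come from a fitted model (for example, one estimated with Stan), and a rule like `prefer_complex` guards against selecting a richer random-effects structure on the basis of a difference that is small relative to its uncertainty.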

Related Organizations

ROYAL INSTITUTION FOR THE ADVANCEMENT OF LEARNING MCGILL UNIVERSITY
Canada
McGill University
Canada
Loyola University Chicago
United States



Funded by

NSERC | unidentified
