
arXiv: 1101.0891
Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.
Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-STS330
FOS: Computer and information sciences, predictive power, causality, data mining, Methodology (stat.ME), Foundations and philosophical topics in statistics, scientific research, explanatory modeling, statistical strategy, predictive modeling, Statistics - Methodology
| selected citations | 2K |
| popularity | Top 0.01% |
| influence | Top 0.1% |
| impulse | Top 1% |
