
doi: 10.1002/wics.1327
Selecting, from a large set of variables, those that most influence a response variable is an important problem in statistics. When the assumed regression model involves a nonparametric component, penalized regression techniques, and in particular P-splines, are among the commonly used methods. The aim of this paper is to provide a brief review of variable selection methods using P-splines. Starting from multiple linear regression models, with least-squares regression and ridge regression, we review standard methods that perform variable selection, such as the LASSO, the nonnegative garrote, and the SCAD method. We briefly discuss a general framework of penalization and regularization methods. Moving toward more flexible regression models with one or more nonparametric components, we discuss P-spline estimation. For some examples of flexible regression models, we then review a few variable selection methods using P-splines. A brief discussion of grouped regularization techniques and of a robust variable selection method is given. Furthermore, we mention key ingredients of Bayesian approaches, and end the paper by drawing attention to several other issues in variable selection with P-splines. Throughout the paper we provide some illustrations.

WIREs Comput Stat 2015, 7:1–20. doi: 10.1002/wics.1327

This article is categorized under: Statistical and Graphical Methods of Data Analysis > Nonparametric Methods; Statistical Models > Model Selection
varying coefficient models, regularization techniques, P-splines, linear regression, ridge regression, robust variable selection, additive regression, Computational methods for problems pertaining to statistics, variable selection
