
arXiv: 1310.1282
We consider the problems of variable selection and estimation in nonparametric additive regression models for high-dimensional data. In recent years, several methods have been proposed to model nonlinear relationships when the number of covariates exceeds the number of observations, using spline basis functions and group penalties. Nonlinear {\it monotone} effects on the response play a central role in many situations, in particular in medicine and biology. We construct the monotone splines lasso (MS-lasso) to select variables and estimate effects using monotone splines (I-splines). The additive components in the model are represented by their I-spline basis function expansions, so that component selection reduces to selecting groups of coefficients in these expansions. We use a recent procedure, called the cooperative lasso, to select sign-coherent groups, that is, groups with either exclusively non-negative or exclusively non-positive coefficients. This leads to the selection of important covariates that have a nonlinear monotone increasing or decreasing effect on the response. We also introduce an adaptive version of the MS-lasso, which considerably reduces both the bias and the number of false positive selections. We compare the MS-lasso and the adaptive MS-lasso with other existing methods for variable selection in high dimensions by simulation, and illustrate the method on two relevant genomic data sets. Results indicate that the (adaptive) MS-lasso has excellent properties compared to the other methods in terms of both estimation and selection, and can be recommended for high-dimensional monotone regression.
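The core construction in the abstract can be illustrated with a minimal sketch: expand a covariate in an I-spline basis (each basis function is monotone increasing from 0 to 1), then fit with a sign-constrained penalized regression so that the non-negative coefficients yield a monotone increasing component. As a simplified stand-in for the paper's cooperative lasso, the sketch below uses a non-negative lasso (scikit-learn's `Lasso` with `positive=True`) on a single covariate; all function names and parameter values are illustrative, not from the paper.

```python
import numpy as np
from scipy.interpolate import BSpline
from sklearn.linear_model import Lasso

def ispline_basis(x, n_basis=6, degree=3):
    """I-spline basis: normalized antiderivatives of B-spline basis functions.

    Each column is monotone increasing from 0 to 1 on [0, 1], so any
    non-negative linear combination of columns is monotone increasing.
    """
    # Open uniform knot vector on [0, 1]
    inner = np.linspace(0.0, 1.0, n_basis - degree + 1)
    t = np.r_[[0.0] * degree, inner, [1.0] * degree]
    cols = []
    for j in range(n_basis):
        c = np.zeros(n_basis)
        c[j] = 1.0
        b = BSpline(t, c, degree).antiderivative()
        # Normalize so each basis function runs from 0 up to 1
        cols.append((b(x) - b(0.0)) / (b(1.0) - b(0.0)))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = np.log1p(5 * x) + rng.normal(scale=0.1, size=200)  # monotone truth + noise

X = ispline_basis(x)
# positive=True constrains all coefficients to be non-negative,
# so the fitted additive component is monotone increasing.
fit = Lasso(alpha=1e-3, positive=True).fit(X, y)
yhat = fit.predict(X)
```

In the full method, each covariate gets its own group of I-spline coefficients, and the cooperative lasso penalty both selects groups and enforces sign coherence within them; the non-negative lasso above captures only the single-component, increasing-direction case.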
19 pages, 4 figures and 7 tables
FOS: Computer and information sciences, Ridge regression; shrinkage estimators (Lasso), nonparametric additive models, cooperative Lasso, Methodology (stat.ME), high-dimensional data, monotone regression, Nonparametric regression and quantile regression, Lasso, Computational methods for problems pertaining to statistics, I-splines, Statistics - Methodology
