
arXiv: 1605.08299
We consider the problem of robustifying high-dimensional structured estimation. Robust techniques are key in real-world applications, which often involve outliers and data corruption. We focus on trimmed versions of structurally regularized M-estimators in the high-dimensional setting, including the popular Least Trimmed Squares estimator, as well as analogous estimators for generalized linear models and graphical models, using possibly non-convex loss functions. We present a general analysis of their statistical convergence rates and consistency, and then take a closer look at the trimmed versions of the Lasso and Graphical Lasso estimators as special cases. On the optimization side, we show how to extend algorithms for M-estimators to fit trimmed variants and provide guarantees on their numerical convergence. The generality and competitive performance of high-dimensional trimmed estimators are illustrated numerically on both simulated and real-world genomics data.
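To make the idea of a trimmed Lasso concrete, here is a minimal sketch, not the paper's algorithm. It assumes a simple alternating heuristic: fit a Lasso on the currently retained samples, then retain the h samples with the smallest squared residuals, and repeat. The function names (`soft_threshold`, `lasso_ista`, `trimmed_lasso`), the proximal-gradient inner solver, and all parameter choices are hypothetical and chosen for illustration only.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, beta0, n_iter=200):
    # Proximal gradient (ISTA) for (1/(2m)) * ||y - X b||^2 + lam * ||b||_1.
    m = X.shape[0]
    # Step size 1/L, where L = sigma_max(X)^2 / m is the gradient's Lipschitz constant.
    step = m / (np.linalg.norm(X, 2) ** 2)
    beta = beta0.copy()
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / m
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def trimmed_lasso(X, y, h, lam, n_outer=20):
    # Illustrative alternating scheme (an assumption, not the authors' method):
    # 1) fit a Lasso on the retained samples,
    # 2) retain the h samples with the smallest squared residuals.
    n, p = X.shape
    beta = np.zeros(p)
    keep = np.arange(n)  # start from all samples
    for _ in range(n_outer):
        beta = lasso_ista(X[keep], y[keep], lam, beta)
        resid2 = (y - X @ beta) ** 2
        new_keep = np.sort(np.argsort(resid2)[:h])
        if np.array_equal(new_keep, keep):
            break  # retained set is stable
        keep = new_keep
    return beta, keep
```

With `h` equal to the sample size this reduces to an ordinary Lasso fit; a smaller `h` lets the procedure discard the samples it fits worst, which is the trimming idea behind Least Trimmed Squares.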
39 pages, 6 figures
FOS: Computer and information sciences, Ridge regression; shrinkage estimators (Lasso), high-dimensional variable selection, Estimation in multivariate analysis, Machine Learning (stat.ML), robust estimation, Applications of statistics to biology and medical sciences; meta analysis, Statistics - Machine Learning, sparse learning, Robustness and adaptive procedures (parametric inference), Lasso, 65K10, 90C06, 62F35, 47N30, Probabilistic graphical models
| Indicator | Value |
| --- | --- |
| Selected citations | 16 |
| Popularity | Top 10% |
| Influence | Top 10% |
| Impulse | Top 10% |
