SOCP relaxation bounds for the optimal subset selection problem applied to robust linear regression
Mathematics - Optimization and Control | Mathematics - Statistics Theory
This paper deals with the problem of finding the globally optimal subset of h elements from a larger set of n elements in d space dimensions so as to minimize a quadratic criterion, with an special emphasis on applications to computing the Least Trimmed Squares Estimator (LTSE) for robust regression. The computation of the LTSE is a challenging subset selection problem involving a nonlinear program with continuous and binary variables, linked in a highly nonlinear fashion. The selection of a globally optimal subset using the branch and bound (BB) algorithm is limited to problems in very low dimension, tipically d<5, as the complexity of the problem increases exponentially with d. We introduce a bold pruning strategy in the BB algorithm that results in a significant reduction in computing time, at the price of a negligeable accuracy lost. The novelty of our algorithm is that the bounds at nodes of the BB tree come from pseudo-convexifications derived using a linearization technique with approximate bounds for the nonlinear terms. The approximate bounds are computed solving an auxiliary semidefinite optimization problem. We show through a computational study that our algorithm performs well in a wide set of the most difficult instances of the LTSE problem.