SOCP relaxation bounds for the optimal subset selection problem applied to robust linear regression

Preprint English OPEN
Flores, Salvador (2015)

This paper deals with the problem of finding the globally optimal subset of h elements from a larger set of n elements in d space dimensions so as to minimize a quadratic criterion, with special emphasis on applications to computing the Least Trimmed Squares Estimator (LTSE) for robust regression. The computation of the LTSE is a challenging subset selection problem involving a nonlinear program with continuous and binary variables, linked in a highly nonlinear fashion. The selection of a globally optimal subset using the branch and bound (BB) algorithm is limited to problems in very low dimension, typically d < 5, as the complexity of the problem increases exponentially with d. We introduce a bold pruning strategy in the BB algorithm that results in a significant reduction in computing time, at the price of a negligible loss of accuracy. The novelty of our algorithm is that the bounds at nodes of the BB tree come from pseudo-convexifications derived using a linearization technique with approximate bounds for the nonlinear terms. The approximate bounds are computed by solving an auxiliary semidefinite optimization problem. We show through a computational study that our algorithm performs well on a wide set of the most difficult instances of the LTSE problem.
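To make the subset selection problem concrete, the following is a minimal sketch of the LTSE computed by exhaustive enumeration: every subset of h observations is fitted by ordinary least squares and the subset with the smallest residual sum of squares is kept. This is not the branch and bound algorithm of the paper, only the brute-force baseline whose cost it aims to avoid. The function name `lts_brute_force` and the toy data are hypothetical and chosen for illustration.

```python
from itertools import combinations

import numpy as np


def lts_brute_force(X, y, h):
    """Exhaustive LTS: fit OLS on every h-subset and keep the best one.

    Illustrative only; the number of subsets is C(n, h), so this is
    usable for tiny n. It is not the method proposed in the paper.
    """
    n = X.shape[0]
    best_rss, best_subset, best_beta = np.inf, None, None
    for subset in combinations(range(n), h):
        idx = list(subset)
        # Least squares fit restricted to the selected observations.
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        rss = float(np.sum((y[idx] - X[idx] @ beta) ** 2))
        if rss < best_rss:
            best_rss, best_subset, best_beta = rss, subset, beta
    return best_rss, best_subset, best_beta


# Toy data (assumed): 8 points on the line y = 1 + 2x, two of them corrupted.
x = np.arange(8.0)
y = 1.0 + 2.0 * x
y[6] += 10.0   # outlier
y[7] -= 15.0   # outlier
X = np.column_stack([np.ones_like(x), x])  # intercept and slope columns

rss, subset, beta = lts_brute_force(X, y, h=6)
print(subset, beta, rss)
```

With n = 8 and h = 6 there are only 28 subsets to try, but the count grows combinatorially with n, which is why bounding and pruning strategies such as the one described in the abstract matter.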