Studying the rate of convergence of gradient optimisation algorithms via the theory of optimal experimental design

Doctoral thesis (English, Open Access)
Haycroft, Rebecca Jane
  • Subject: QA

The most common class of methods for solving quadratic optimisation problems is the class of gradient algorithms, the most famous of which is the Steepest Descent algorithm. The development of a particular gradient algorithm, the Barzilai-Borwein algorithm, has sparked considerable research in the area in recent years, and many algorithms now exist with faster rates of convergence than that of the Steepest Descent algorithm. The tools available for analysing and comparing the asymptotic rates of convergence of gradient algorithms are, however, limited, and so it is somewhat unclear from the literature which algorithms possess the faster rates of convergence. In this thesis, methodology is developed to enable better analysis of the asymptotic rates of convergence of gradient algorithms applied to quadratic optimisation problems. This methodology stems from a link with the theory of optimal experimental design. It is established that gradient algorithms can be related to algorithms for constructing optimal experimental designs for linear regression models. Furthermore, the asymptotic rates of convergence of these gradient algorithms can be expressed through the asymptotic behaviour of multiplicative algorithms for constructing optimal experimental designs. This connection to optimal experimental design has also been used to guide the creation of several new gradient algorithms which would not otherwise have been arrived at intuitively. The asymptotic rates of convergence of these algorithms are studied extensively, and insight is given into how some gradient algorithms are able to converge faster than others. It is demonstrated that the worst rates are obtained when the corresponding multiplicative procedure for updating the designs converges to the optimal design. Simulations reveal that the asymptotic rates of convergence of some of these new algorithms compare favourably with those of existing gradient-type algorithms such as the Barzilai-Borwein algorithm.
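
To make the comparison concrete, below is a minimal sketch (not taken from the thesis; the test matrix, dimensions, stopping rule and function names are illustrative assumptions) contrasting the exact-line-search Steepest Descent step with the first Barzilai-Borwein stepsize on a convex quadratic f(x) = 0.5 x'Ax - b'x.

    import numpy as np

    def steepest_descent(A, b, x0, iters=200):
        # Exact line search along the negative gradient: alpha_k = g'g / g'Ag.
        x = x0.copy()
        for _ in range(iters):
            g = A @ x - b
            if np.linalg.norm(g) < 1e-12:
                break
            x = x - (g @ g) / (g @ (A @ g)) * g
        return x

    def barzilai_borwein(A, b, x0, iters=200):
        # First BB stepsize: alpha_k = s's / s'y, with s = x_k - x_{k-1} and y = g_k - g_{k-1}.
        x_prev = x0.copy()
        g_prev = A @ x_prev - b
        x = x_prev - 1e-4 * g_prev              # short initial step to start the recursion
        for _ in range(iters):
            g = A @ x - b
            if np.linalg.norm(g) < 1e-12:
                break
            s, y = x - x_prev, g - g_prev
            x_prev, g_prev = x, g
            x = x - (s @ s) / (s @ y) * g
        return x

    # Illustrative test problem: symmetric positive definite A with condition number 100.
    rng = np.random.default_rng(0)
    dim = 20
    Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    A = Q @ np.diag(np.linspace(1.0, 100.0, dim)) @ Q.T
    b = rng.standard_normal(dim)
    x_star = np.linalg.solve(A, b)
    for method in (steepest_descent, barzilai_borwein):
        err = np.linalg.norm(method(A, b, np.zeros(dim)) - x_star)
        print(f"{method.__name__}: error {err:.2e}")

On ill-conditioned quadratics of this kind, the non-monotone Barzilai-Borwein steps typically reduce the error far faster than Steepest Descent, which is the kind of asymptotic behaviour the thesis analyses. The multiplicative design algorithms mentioned in the abstract can be sketched in the same spirit: the snippet below shows the classical multiplicative update w_i <- w_i * d_i(w)/m for D-optimal design on a finite candidate set (again an illustrative assumption for exposition, not the thesis' own construction), whose fixed points are D-optimal designs.

    import numpy as np

    def multiplicative_d_optimal(X, iters=1000):
        # Rows of X are n candidate design points in R^m; w holds the design weights.
        # Update: w_i <- w_i * d_i(w)/m, where d_i(w) = x_i' M(w)^{-1} x_i and
        # M(w) = sum_i w_i x_i x_i'; since sum_i w_i d_i(w) = m, w stays normalised.
        n, m = X.shape
        w = np.full(n, 1.0 / n)                 # start from the uniform design
        for _ in range(iters):
            M = X.T @ (w[:, None] * X)          # information matrix M(w)
            d = np.einsum("ij,jk,ik->i", X, np.linalg.inv(M), X)
            w = w * d / m
        return w

    # Example: D-optimal design for quadratic regression 1, t, t^2 on a grid in [-1, 1];
    # the weights concentrate towards the design supported on -1, 0 and 1.
    t = np.linspace(-1.0, 1.0, 21)
    X = np.column_stack([np.ones_like(t), t, t**2])
    print(np.round(multiplicative_d_optimal(X), 3))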

  • Table of Contents

    1 Introduction
      1.1 Convex Optimisation
        1.1.1 Convexity and Concavity
        1.1.2 Quadratic Optimisation
        1.1.3 Iterative Methods
      1.2 Steepest Descent
      1.3 The Conjugate Gradient Method
      1.4 Iterative Methods for Solving Linear Systems
        1.4.1 Stationary Iterative Methods
        1.4.2 Krylov Subspace Methods
      1.5 Recent Successful Gradient Algorithms
        1.5.1 The Barzilai-Borwein Algorithm
        1.5.2 Generalisations of the Barzilai-Borwein Algorithm
        1.5.3 Other New Gradient Methods
      1.6 Motivation of Thesis

    2 Gradient Algorithms and their Relation to Optimal Design Theory
      2.1 Renormalised Versions of Gradient Algorithms
        2.1.1 Renormalisation of the Steepest Descent Algorithm
        2.1.2 Renormalisation of a General Gradient Algorithm
      2.2 Asymptotic Behaviour of the Steepest Descent Algorithm
      2.3 Relation to Optimal Experimental Design
        2.3.1 Optimal Experimental Design Terminology
        2.3.2 Constructing Gradient Algorithms which Correspond to Given Optimality Criteria
        2.3.3 Rate of Convergence of Gradient Algorithms Corresponding to Optimal Designs

    3 The γ-Steepest Descent Algorithm
      3.1 The Renormalised γ-Steepest Descent Algorithm
        3.1.1 Concavity of the Optimality Criterion
        3.1.2 Convergence to an Optimal Design
        3.1.3 Speed of Convergence to the Optimum Design
        3.1.4 Behaviour of the Sequence {Φ(ξ^(k))}
      3.2 Asymptotic Rate of Convergence
        3.2.1 Dependence on γ
        3.2.2 Dependence on p
        3.2.3 Dependence on d
        3.2.4 Behaviour of r^(k)
      3.3 γ-Steepest Descent Summary
