
arXiv: 1904.03537
Backtracking line-search is an old yet powerful strategy for finding better step sizes to be used in proximal gradient algorithms. The main principle is to locally find a simple convex upper bound of the objective function, which in turn controls the step size that is used. In the case of inertial proximal gradient algorithms, the situation becomes much more difficult and usually leads to very restrictive rules for the extrapolation parameter. In this paper, we show that the extrapolation parameter can be controlled by also locally finding a simple concave lower bound of the objective function. This gives rise to a double convex-concave backtracking procedure which allows for an adaptive choice of both the step size and the extrapolation parameter. We apply this procedure to the class of inertial Bregman proximal gradient methods and prove that any sequence generated by these algorithms converges globally to a critical point of the function at hand. Numerical experiments on a number of challenging non-convex problems in image processing and machine learning demonstrate the power of combining inertial steps with the double backtracking strategy to achieve improved performance.
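To make the double backtracking idea concrete, the following is a minimal sketch in Python. It specializes the paper's Bregman setting to the Euclidean case (squared-norm distances) and to an l1-regularized composite problem, so `soft_threshold` plays the role of the proximal map. The function names, the safety factor `delta`, and the simplified extrapolation rule `gamma = min(1, sqrt(delta * L_prev / Lbar))` are illustrative assumptions, not the paper's exact rule for its CoCaIn algorithm.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal map of t * ||.||_1 (the non-smooth part in this sketch).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def double_backtracking_sketch(grad_f, f, x0, lam=0.1, L0=1.0,
                               delta=0.9, nu=2.0, max_iter=200):
    """Inertial proximal gradient with convex-concave backtracking
    (Euclidean sketch, illustrative only).

    grad_f, f : gradient and value of the smooth part
    lam       : weight of the l1 term handled by the prox
    L0        : initial guess for the upper-bound parameter
    delta     : safety factor < 1 in the extrapolation condition
    nu        : backtracking multiplier
    """
    x_prev, x = x0.copy(), x0.copy()
    L_prev = L0
    for _ in range(max_iter):
        # --- Backtrack the concave lower bound to pick gamma ---
        Lbar = L_prev
        while True:
            gamma = min(1.0, np.sqrt(delta * L_prev / Lbar))
            y = x + gamma * (x - x_prev)
            # Lower-bound check:
            # f(x) >= f(y) + <grad f(y), x - y> - (Lbar/2) ||x - y||^2
            rhs = (f(y) + grad_f(y) @ (x - y)
                   - 0.5 * Lbar * np.sum((x - y) ** 2))
            if f(x) >= rhs:
                break
            Lbar *= nu  # lower bound too loose: raise Lbar, shrink gamma
        # --- Backtrack the convex upper bound to pick step size 1/L ---
        L, g = Lbar, grad_f(y)
        while True:
            x_new = soft_threshold(y - g / L, lam / L)
            # Upper-bound (descent lemma) check with constant L
            ub = f(y) + g @ (x_new - y) + 0.5 * L * np.sum((x_new - y) ** 2)
            if f(x_new) <= ub:
                break
            L *= nu
        x_prev, x, L_prev = x, x_new, L
    return x

# Usage on a lasso-type test instance (convex, used here only to
# exercise the code; the paper targets non-convex problems):
rng = np.random.default_rng(0)
A, b = rng.standard_normal((40, 100)), rng.standard_normal(40)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad_f = lambda x: A.T @ (A @ x - b)
x_star = double_backtracking_sketch(grad_f, f, x0=np.zeros(100))
```

Note how the two loops divide the work: the lower-bound loop only ever increases `Lbar`, which shrinks the extrapolation parameter toward a safe value, while the upper-bound loop enforces the usual descent inequality for the proximal step.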
29 pages
Decomposition methods, FOS: Computer and information sciences, Convex programming, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG), Convex functions and convex programs in convex geometry, Numerical mathematical programming methods, non-Euclidean distances, FOS: Mathematics, 90C25, 26B25, 49M27, 52A41, 65K05, proximal gradient algorithms, convex-concave backtracking, Mathematics - Numerical Analysis, Mathematics - Optimization and Control, Convexity of real functions of several variables, generalizations, inertial methods, Numerical Analysis (math.NA), Kurdyka-Łojasiewicz property, Bregman distance, composite minimization, global convergence, Optimization and Control (math.OC)
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 25 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
