
In a learning context, data distributions are usually unknown, and observation models can be complex. In an inverse problem setup, these facts often lead to the minimization of a loss function whose analytic expression is uncertain, so that its gradient cannot be evaluated exactly. These issues have promoted the development of so-called stochastic optimization methods, which are able to cope with stochastic errors in the gradient term. A natural strategy is to start from a deterministic optimization approach as a baseline and to incorporate a stabilization procedure (e.g., a decreasing stepsize or averaging) that yields improved robustness to stochastic errors. In the context of large-scale differentiable optimization, an important class of methods relies on the principle of majorization-minimization (MM). MM algorithms are becoming increasingly popular in signal/image processing and machine learning: they are fast, stable, require limited manual tuning, and are often preferred by practitioners in application domains such as medical imaging and telecommunications. The present work introduces novel theoretical convergence guarantees for MM algorithms when approximate gradient terms are employed, generalizing recent work to a wider class of functions and algorithms. We illustrate our theoretical results with a binary classification problem.
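The strategy described above can be sketched on the paper's illustrative problem, binary logistic regression. The following is a minimal, hypothetical example (not the authors' algorithm): the logistic Hessian is bounded above by a fixed quadratic majorant, which serves as a preconditioner, the gradient is approximated from a random minibatch, and a decreasing stepsize provides the stabilization. All variable names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data with labels in {-1, +1} (illustrative only).
n, d = 200, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(n))

lam = 0.1  # ridge regularization weight (assumed)

def loss(w):
    # Regularized logistic loss.
    return np.mean(np.log1p(np.exp(-y * (X @ w)))) + 0.5 * lam * w @ w

# Quadratic majorant: the logistic Hessian is bounded by (1/4n) X^T X,
# so A majorizes the curvature everywhere (a standard bound).
A = X.T @ X / (4 * n) + lam * np.eye(d)
A_inv = np.linalg.inv(A)

w = np.zeros(d)
batch = 32
for k in range(300):
    idx = rng.choice(n, batch, replace=False)       # minibatch -> stochastic gradient
    s = -y[idx] / (1.0 + np.exp(y[idx] * (X[idx] @ w)))
    g = X[idx].T @ s / batch + lam * w              # approximate (noisy) gradient
    step = 1.0 / (1.0 + 0.01 * k)                   # decreasing stepsize for stability
    w = w - step * (A_inv @ g)                      # majorant-preconditioned update

print(loss(np.zeros(d)), loss(w))
```

The majorant-based preconditioner `A_inv` plays the role of the MM quadratic surrogate, while the decreasing stepsize damps the minibatch noise, illustrating the stabilization principle discussed in the abstract.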
Stochastic optimization, binary logistic regression, subspace acceleration, Majorization-Minimization, convergence analysis, [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing
