Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization

Name: Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization
Creator: Mairal, Julien
Keywords: majorization-minimization, FOS: Computer and information sciences, Computer Science - Machine Learning, [MATH.MATH-OC] Mathematics [math]/Optimization and Control [math.OC], Machine Learning (stat.ML), [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], 02 engineering and technology, [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], Machine Learning (cs.LG), surrogate functions

arXiv.org e-Print Archive

Preprint . 2013

Data sources: arXiv.org e-Print Archive

INRIA2

Conference object . 2013

Data sources: INRIA2

INRIA a CCSD electronic archive server

Conference object . 2013

Data sources: INRIA a CCSD electronic archive server

https://dx.doi.org/10.48550/ar...

Article . 2013

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

Publication: Article, Preprint, Conference object | 01 Jan 2013 | Embargo end date: 01 Jan 2013 | Publisher: arXiv

Authors: Mairal, Julien

doi: 10.48550/arxiv.1306.4650

arXiv: 1306.4650

Abstract

Majorization-minimization algorithms consist of iteratively minimizing a majorizing surrogate of an objective function. Because of its simplicity and its wide applicability, this principle has been very popular in statistics and in signal processing. In this paper, we intend to make this principle scalable. We introduce a stochastic majorization-minimization scheme which is able to deal with large-scale or possibly infinite data sets. When applied to convex optimization problems under suitable assumptions, we show that it achieves an expected convergence rate of $O(1/\sqrt{n})$ after $n$ iterations, and of $O(1/n)$ for strongly convex functions. Equally important, our scheme almost surely converges to stationary points for a large class of non-convex problems. We develop several efficient algorithms based on our framework. First, we propose a new stochastic proximal gradient method, which experimentally matches state-of-the-art solvers for large-scale $\ell_1$-logistic regression. Second, we develop an online DC programming algorithm for non-convex sparse estimation. Finally, we demonstrate the effectiveness of our approach for solving large-scale structured matrix factorization problems.
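The principle described in the abstract can be illustrated with a small, self-contained sketch. It is not taken from the paper's code. It assumes a quadratic (Lipschitz-gradient) surrogate for each sampled logistic loss, an $\ell_1$ penalty, and weights $w_n = 1/\sqrt{n}$. The synthetic data and the names `make_data`, `stochastic_mm`, and `demo` are hypothetical choices made for illustration only. Under these assumptions, minimizing the weighted average of the surrogates reduces to soft-thresholding a point built from running averages of past iterates and gradients.

```python
import math
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def logistic_loss(theta, a, y):
    """log(1 + exp(-y * <a, theta>)) for y in {-1, +1}, evaluated stably."""
    margin = y * dot(a, theta)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def logistic_grad(theta, a, y):
    """Gradient of the logistic loss with respect to theta."""
    margin = y * dot(a, theta)
    if margin >= 0:
        e = math.exp(-margin)
        s = e / (1.0 + e)          # equals 1 / (1 + exp(margin))
    else:
        s = 1.0 / (1.0 + math.exp(margin))
    return [-y * s * ai for ai in a]


def objective(theta, samples, lam):
    """Average logistic loss plus the l1 penalty."""
    fit = sum(logistic_loss(theta, a, y) for a, y in samples) / len(samples)
    return fit + lam * sum(abs(t) for t in theta)


def make_data(n_samples, seed=0):
    """Hypothetical synthetic data: labels from a sparse linear rule."""
    rng = random.Random(seed)
    theta_true = [2.0, -1.0, 0.0, 0.0, 0.0]
    samples = []
    for _ in range(n_samples):
        a = [rng.gauss(0.0, 1.0) for _ in theta_true]
        y = 1.0 if dot(a, theta_true) >= 0 else -1.0
        samples.append((a, y))
    return samples


def stochastic_mm(samples, lam, L, n_iters, seed=0):
    """Sketch of a stochastic MM loop with quadratic surrogates.

    Each sampled loss f_n is majorized around the current iterate by
    f_n(t) + <grad, theta - t> + (L/2)||theta - t||^2, which is valid when
    L >= ||a||^2 / 4 for every sample. The averaged surrogate is minimized
    in closed form by soft-thresholding.
    """
    rng = random.Random(seed)
    dim = len(samples[0][0])
    theta = [0.0] * dim
    z_bar = [0.0] * dim   # weighted average of past iterates
    g_bar = [0.0] * dim   # weighted average of past gradients
    for n in range(1, n_iters + 1):
        a, y = rng.choice(samples)
        w = 1.0 / math.sqrt(n)   # assumed weight sequence; w = 1 at n = 1
        grad = logistic_grad(theta, a, y)
        z_bar = [(1.0 - w) * zb + w * t for zb, t in zip(z_bar, theta)]
        g_bar = [(1.0 - w) * gb + w * g for gb, g in zip(g_bar, grad)]
        # Minimizer of the averaged quadratic surrogate plus lam * ||.||_1.
        theta = [soft_threshold(zb - gb / L, lam / L)
                 for zb, gb in zip(z_bar, g_bar)]
    return theta


def demo():
    samples = make_data(200)
    lam = 0.01
    L = max(dot(a, a) for a, _ in samples) / 4.0
    theta = stochastic_mm(samples, lam, L, n_iters=2000)
    start = objective([0.0] * len(theta), samples, lam)
    return start, objective(theta, samples, lam), theta
```

Calling `demo()` returns the objective at the zero vector, the objective at the final iterate, and the iterate itself, so the two objective values can be compared directly. The closed-form update works because a weighted sum of quadratics with the same curvature `L` is again a quadratic with curvature `L`, centered at `z_bar - g_bar / L`.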

Accepted for publication at Neural Information Processing Systems (NIPS) 2013. This is the 9-page version, followed by 16 pages of appendices. The title has changed compared to the first technical report.

Related Organizations

Grenoble Alpes University
France
French Institute for Research in Computer Science and Automation
France
Grenoble INP - UGA
France
Laboratoire Jean Kuntzmann
France
French National Centre for Scientific Research
France

Keywords

majorization-minimization, FOS: Computer and information sciences, Computer Science - Machine Learning, [MATH.MATH-OC] Mathematics [math]/Optimization and Control [math.OC], Machine Learning (stat.ML), [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], Machine Learning (cs.LG), surrogate functions, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, optimization, Mathematics - Optimization and Control

Impact by BIP!

- Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Open Access route: Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

Related to Research communities

INRIA

University Network for Innovation, Technology and Engineering