Regularized stochastic BFGS algorithm

Publication type: Article, Conference object
Date: 01 Dec 2013
Publisher: IEEE
Journal: 2013 IEEE Global Conference on Signal and Information Processing

Authors: Aryan Mokhtari; Alejandro Ribeiro

doi: 10.1109/globalsip.2013.6737088

Abstract

A regularized stochastic version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method is proposed to solve optimization problems with stochastic objectives that arise in large-scale machine learning. Stochastic gradient descent is the currently preferred solution methodology, but the number of iterations required to approximate optimal arguments can be prohibitive in high-dimensional problems. BFGS modifies gradient descent by introducing a Hessian approximation matrix computed from finite gradient differences. This paper uses stochastic gradient differences and introduces a regularization to ensure that the Hessian approximation matrix remains well conditioned. The resulting regularized stochastic BFGS method is shown to converge to optimal arguments almost surely over realizations of the stochastic gradient sequence. Numerical experiments showcase reductions in convergence time relative to stochastic gradient descent algorithms and non-regularized stochastic versions of BFGS.
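
The abstract describes the method only in words, so the following is a minimal sketch of what a regularized stochastic BFGS iteration could look like, not the authors' implementation. Everything specific in it is an assumption made for illustration: the two-dimensional quadratic test objective, the regularization constants `delta` and `gamma`, the decaying step-size schedule, and the exact form of the update (a standard BFGS update applied to a gradient difference shifted by `delta * v`, followed by adding `delta * I` so that the eigenvalues of the Hessian approximation stay at or above `delta`). Both gradients in the difference are evaluated with the same random sample.

```python
import random

# Hypothetical test problem: minimize E[0.5 * (w - theta)^T A (w - theta)],
# with theta drawn around W_STAR, so the optimal argument is W_STAR.
A = [[1.0, 0.0], [0.0, 10.0]]
W_STAR = [1.0, -2.0]


def matvec(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]


def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]


def solve2(M, b):
    """Solve the 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * b[0] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def stoch_grad(w, theta):
    """Stochastic gradient of the sampled quadratic at w."""
    return matvec(A, [w[0] - theta[0], w[1] - theta[1]])


def regularized_stochastic_bfgs(w0, steps=3000, delta=0.1, gamma=0.01,
                                eps0=0.1, t0=100.0, sigma=0.5, seed=0):
    """Illustrative sketch; all defaults are assumptions, not paper values."""
    rng = random.Random(seed)
    w = list(w0)
    B = [[1.0, 0.0], [0.0, 1.0]]  # initial Hessian approximation
    for t in range(steps):
        theta = [rng.gauss(W_STAR[0], sigma), rng.gauss(W_STAR[1], sigma)]
        g = stoch_grad(w, theta)
        eps = eps0 * t0 / (t0 + t)  # decaying step size
        d = solve2(B, g)            # d = B^{-1} g
        # Descent step along (B^{-1} + gamma * I) g.
        w_new = [w[i] - eps * (d[i] + gamma * g[i]) for i in range(2)]
        v = [w_new[i] - w[i] for i in range(2)]
        # Stochastic gradient difference, same sample theta at both points,
        # shifted by delta * v (the regularization assumed here).
        g_new = stoch_grad(w_new, theta)
        r = [g_new[i] - g[i] - delta * v[i] for i in range(2)]
        Bv = matvec(B, v)
        vr, vBv = dot(v, r), dot(v, Bv)
        # Skip the update if the curvature condition fails numerically.
        if vr > 1e-12 and vBv > 1e-12:
            B = [[B[i][j] + r[i] * r[j] / vr - Bv[i] * Bv[j] / vBv
                  + (delta if i == j else 0.0) for j in range(2)]
                 for i in range(2)]
        w = w_new
    return w, B
```

Calling `regularized_stochastic_bfgs([5.0, 5.0])` returns the final iterate and the final Hessian approximation. In this toy setting `delta` is chosen below the smallest eigenvalue of `A`, which keeps the inner product `vr` positive; on a real problem that choice would need to be justified separately.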

Related Organizations

University of Pennsylvania
United States

Impact by BIP!

- Selected citations: 5 (derived from selected sources; an alternative to the "Influence" indicator)
- Popularity: Average (the "current" impact or attention of the article in the research community, based on the underlying citation network)
- Influence: Top 10% (the overall impact of the article in the research community, based on the underlying citation network, diachronically)
- Impulse: Average (the initial momentum of the article directly after its publication, based on the underlying citation network)