
arXiv: 1905.05920
We introduce a hybrid stochastic estimator to design stochastic gradient algorithms for solving stochastic optimization problems. Such a hybrid estimator is a convex combination of two existing biased and unbiased estimators and leads to some useful property on its variance. We limit our consideration to a hybrid SARAH-SGD for nonconvex expectation problems. However, our idea can be extended to handle a broader class of estimators in both convex and nonconvex settings. We propose a new single-loop stochastic gradient descent algorithm that can achieve $O(\max\{��^3\varepsilon^{-1},��\varepsilon^{-3}\})$-complexity bound to obtain an $\varepsilon$-stationary point under smoothness and $��^2$-bounded variance assumptions. This complexity is better than $O(��^2\varepsilon^{-4})$ often obtained in state-of-the-art SGDs when $��< O(\varepsilon^{-3})$. We also consider different extensions of our method, including constant and adaptive step-size with single-loop, double-loop, and mini-batch variants. We compare our algorithms with existing methods on several datasets using two nonconvex models.
41 pages and 18 figures
FOS: Computer and information sciences, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control
FOS: Computer and information sciences, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
