
arXiv: 2208.08425
To increase the training speed of distributed learning, recent years have witnessed a significant amount of interest in developing both synchronous and asynchronous distributed stochastic variance-reduced optimization methods. However, all existing synchronous and asynchronous distributed training algorithms suffer from various limitations in either convergence speed or implementation complexity. This motivates us to propose an algorithm called SYNTHESIS (semi-asynchronous path-integrated stochastic gradient search), which leverages the special structure of the variance-reduction framework to overcome the limitations of both synchronous and asynchronous distributed learning algorithms while retaining their salient features. We consider two implementations of SYNTHESIS under distributed and shared memory architectures. We show that our SYNTHESIS algorithms have $O(\sqrt{N}\epsilon^{-2}(\Delta+1)+N)$ and $O(\sqrt{N}\epsilon^{-2}(\Delta+1)d+N)$ computational complexities for achieving an $\epsilon$-stationary point in non-convex learning under distributed and shared memory architectures, respectively, where $N$ denotes the total number of training samples and $\Delta$ represents the maximum delay of the workers. Moreover, we investigate the generalization performance of SYNTHESIS by establishing algorithmic stability bounds for quadratic strongly convex and non-convex optimization. We further conduct extensive numerical experiments to verify our theoretical findings.
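For readers unfamiliar with the "path-integrated" variance-reduction framework the abstract refers to, the following is a minimal single-process sketch of a generic SPIDER/SARAH-style recursive gradient estimator on a toy least-squares problem. It is not the SYNTHESIS algorithm: it has no workers, no asynchrony, and no delay $\Delta$. The function name, the test problem, and every parameter value are illustrative assumptions.

```python
import numpy as np


def path_integrated_sgd(A, b, epochs=20, inner_steps=10, batch=8, lr=0.05, seed=0):
    """Generic SPIDER/SARAH-style loop (assumed sketch, not the paper's method).

    Minimizes f(x) = (1 / 2n) * ||A x - b||^2.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)

    def grad(point, idx):
        # Average gradient of the least-squares loss over the samples in idx.
        residual = A[idx] @ point - b[idx]
        return A[idx].T @ residual / len(idx)

    all_idx = np.arange(n)
    for _ in range(epochs):
        # Outer step: refresh the estimator with a full gradient (cost ~ N).
        v = grad(x, all_idx)
        x_prev, x = x, x - lr * v
        for _ in range(inner_steps):
            # Inner step: recursive update using the same mini-batch at the
            # current and previous iterates ("path integration").
            idx = rng.choice(n, size=batch, replace=False)
            v = v + grad(x, idx) - grad(x_prev, idx)
            x_prev, x = x, x - lr * v
    return x


# Toy, noise-free problem (hypothetical data for illustration only).
rng = np.random.default_rng(1)
A = rng.normal(size=(64, 5))
b = A @ rng.normal(size=5)

x_hat = path_integrated_sgd(A, b)
grad_norm_start = np.linalg.norm(A.T @ (A @ np.zeros(5) - b) / len(b))
grad_norm_end = np.linalg.norm(A.T @ (A @ x_hat - b) / len(b))
```

Comparing `grad_norm_end` with `grad_norm_start` shows how far the iterate has moved toward a stationary point, which is the quantity the $\epsilon$-stationarity bounds above are stated in terms of.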
FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
