
In this paper, a batch gradient algorithm with adaptive momentum is considered and a convergence theorem is presented when it is used for two-layer feedforward neural networks training. Simple but necessary sufficient conditions are offered to guarantee both weak and strong convergence. Compared with existing general requirements, we do not restrict the error function to be quadratic or uniformly convex. A numerical example is supplied to illustrate the performance of the algorithm and support our theoretical finding.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 5 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
