
arXiv: 2508.21286
Federated Learning (FL) is a communication-efficient distributed machine learning method that allows multiple devices to collaboratively train models without sharing raw data. FL can be categorized into centralized and decentralized paradigms. The centralized paradigm relies on a central server to aggregate local models, which can create single points of failure, communication bottlenecks, and exposure of model parameters. In contrast, the decentralized paradigm requires no central server and provides improved robustness and privacy. FL achieves its communication efficiency by performing multiple local updates between communication rounds. However, with heterogeneous and imbalanced data, this approach may converge more slowly or even converge to suboptimal models. To address this challenge, we study decentralized federated averaging via random walk (DFedRW), which replaces multiple local update steps on a single device with random walk updates. Traditional Federated Averaging (FedAvg) and its decentralized versions commonly ignore stragglers, which reduces the amount of training data and introduces sampling bias. Therefore, we allow DFedRW to aggregate partial random walk updates, ensuring that each computation contributes to the model update. To further improve communication efficiency, we also propose a quantized version of DFedRW. We demonstrate that (quantized) DFedRW achieves a convergence upper bound of order $\mathcal{O}(\frac{1}{k^{1-q}})$ under convex conditions. Furthermore, we propose a sufficient condition that reveals when quantization balances communication and convergence. Numerical analysis indicates that our proposed algorithms outperform (decentralized) FedAvg in both convergence rate and accuracy, achieving 38.3\% and 37.5\% increases in test accuracy under high levels of heterogeneity.
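The random walk idea described above can be illustrated with a minimal toy sketch. This is not the authors' implementation: the ring topology, the scalar quadratic local losses, the constant step size, the stochastic-rounding quantizer, and all function names (`neighbors`, `quantize`, `random_walk_update`, `dfedrw_round`, `run`) are assumptions chosen only to keep the example short. Each device launches a walk that takes one gradient step per visited device, walks may have different lengths (a short walk stands in for a straggler whose partial work is still used), and each device then averages the walk outputs of itself and its neighbors.

```python
import math
import random


def neighbors(n, i):
    """Ring topology (an assumption): device i is linked to i-1 and i+1."""
    return [(i - 1) % n, (i + 1) % n]


def quantize(w, step, rng):
    """Stochastic rounding of w onto a grid with spacing `step` (hypothetical quantizer)."""
    low = math.floor(w / step) * step
    p_up = (w - low) / step  # probability of rounding up to the next grid point
    return low + step if rng.random() < p_up else low


def random_walk_update(w, start, targets, walk_len, lr, rng):
    """Carry the model along a random walk, taking one gradient step per visit.

    Device j holds the toy local loss 0.5 * (w - targets[j]) ** 2.
    """
    n = len(targets)
    node = start
    for _ in range(walk_len):
        w -= lr * (w - targets[node])          # local gradient step at `node`
        node = rng.choice(neighbors(n, node))  # hand the model to a random neighbor
    return w


def dfedrw_round(models, targets, walk_lens, lr, rng, quant_step=None):
    """One round: every device launches a walk, then averages with its neighbors."""
    n = len(models)
    walked = [
        random_walk_update(models[i], i, targets, walk_lens[i], lr, rng)
        for i in range(n)
    ]
    if quant_step is not None:
        # Optionally quantize what each device would transmit.
        walked = [quantize(w, quant_step, rng) for w in walked]
    new_models = []
    for i in range(n):
        group = [i] + neighbors(n, i)
        new_models.append(sum(walked[j] for j in group) / len(group))
    return new_models


def run(rounds=200, seed=0):
    rng = random.Random(seed)
    targets = [0.0, 1.0, 2.0, 3.0]  # heterogeneous local optima
    models = [10.0] * len(targets)
    for _ in range(rounds):
        # Walk lengths vary per device; a short walk models a straggler
        # whose partial update is still aggregated.
        walk_lens = [rng.randint(1, 3) for _ in targets]
        models = dfedrw_round(models, targets, walk_lens, lr=0.1, rng=rng)
    return models
```

Calling `run()` returns one scalar model per device. In this toy setting every gradient step and every averaging step is a convex combination, so the models are pulled from the initial value 10.0 into the range spanned by the local optima. Passing `quant_step` to `dfedrw_round` applies the hypothetical quantizer to the values exchanged between devices.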
FOS: Computer and information sciences, Distributed, Parallel, and Cluster Computing (cs.DC)
