Downloads provided by UsageCounts
Abstract
Federated learning (FL) is a distributed machine learning approach that enables remote devices, i.e., workers, to collaboratively fit a neural network model without sharing their data. While this method favors data privacy, an imbalanced data distribution can introduce unfairness into model training, causing discriminatory bias against certain under-represented groups. In this paper, we show that imbalanced federated data worsens equity indexes, i.e., measures of differences in treatment for under-represented classes. To address the problem, we propose a federated learning framework called Z-Fed that 1) balances the training without exchanging privacy-protected data, using a zero-knowledge proof (ZKP) technique, and 2) allows for the collection of information on data distributions based on one or more categorical features to produce metadata about population proportions. The proposed framework infers the precise data distribution without exchanging knowledge of the data categories and uses it to coordinate a balanced training set. Z-Fed aims to mitigate the effect of imbalanced data in FL while respecting privacy and without using mediators or probabilistic approaches. Compared to a non-balanced framework, Z-Fed improves fairness and equality, measured in equal opportunity (EPD) by 53.54%, equalized odds (EOD) by 56.41%, and statistical parity (SPD) by 46.1% on imbalanced UTK datasets, reducing biased predictions among subgroups. EPD, EOD, and SPD measure the disparity of treatment between privileged (e.g., over-represented) and non-privileged groups. Given the results obtained, Z-Fed can reduce discriminatory behaviors and enhance the trustworthiness of federated learning.
Zero knowledge proof, Fairness, Bias, Privacy, Federated learning
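The three disparity measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the definitions below are assumptions based on common usage: SPD as the gap in positive-prediction rate, EPD as the gap in true-positive rate, and EOD as the larger of the absolute true-positive and false-positive rate gaps, each taken between the non-privileged and privileged groups. The function name `fairness_gaps` and its arguments are hypothetical and not part of Z-Fed.

```python
from typing import List, Sequence, Tuple


def _rate(preds: Sequence[int], mask: Sequence[bool]) -> float:
    """Fraction of positive predictions among the samples selected by mask."""
    selected: List[int] = [p for p, m in zip(preds, mask) if m]
    return sum(selected) / len(selected) if selected else 0.0


def fairness_gaps(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    privileged: Sequence[bool],
) -> Tuple[float, float, float]:
    """Return (SPD, EPD, EOD) for binary labels and binary predictions.

    Assumed definitions (not taken from the paper):
      SPD = P(pred=1 | non-privileged) - P(pred=1 | privileged)
      EPD = TPR(non-privileged) - TPR(privileged)
      EOD = max(|TPR gap|, |FPR gap|)
    """
    def group_rates(is_priv: bool) -> Tuple[float, float, float]:
        in_group = [g == is_priv for g in privileged]
        pos_rate = _rate(y_pred, in_group)
        tpr = _rate(y_pred, [g and t == 1 for g, t in zip(in_group, y_true)])
        fpr = _rate(y_pred, [g and t == 0 for g, t in zip(in_group, y_true)])
        return pos_rate, tpr, fpr

    pr_u, tpr_u, fpr_u = group_rates(False)  # non-privileged group
    pr_p, tpr_p, fpr_p = group_rates(True)   # privileged group

    spd = pr_u - pr_p
    epd = tpr_u - tpr_p
    eod = max(abs(tpr_u - tpr_p), abs(fpr_u - fpr_p))
    return spd, epd, eod
```

Under these assumed definitions, a value of 0 for each measure means both groups are treated alike, so an improvement in fairness corresponds to the gaps moving toward 0.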
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 3 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| downloads | | 6 |
