Tight FPT Approximation for Socially Fair Clustering

Publication: Article, Preprint; 01 Jan 2022; Embargo end date: 01 Jan 2021; English; Publisher: Elsevier BV; Journal: SSRN Electronic Journal (eISSN: 1556-5068)

Authors: Dishant Goyal; Ragesh Jaiswal

doi: 10.2139/ssrn.4226483, 10.1016/j.ipl.2023.106383, 10.48550/arxiv.2106.06755

arXiv: 2106.06755


Abstract

In this work, we study the socially fair $k$-median/$k$-means problem. We are given a set of points $P$ in a metric space $\mathcal{X}$ with a distance function $d(\cdot,\cdot)$. There are $\ell$ groups: $P_1,\dotsc,P_{\ell} \subseteq P$. We are also given a set $F$ of feasible centers in $\mathcal{X}$. The goal in the socially fair $k$-median problem is to find a set $C \subseteq F$ of $k$ centers that minimizes the maximum average cost over all the groups. That is, find $C$ that minimizes the objective function $\Phi(C,P) \equiv \max_{j} \Big\{ \sum_{x \in P_j} d(C,x)/|P_j| \Big\}$, where $d(C,x)$ is the distance of $x$ to the closest center in $C$. The socially fair $k$-means problem is defined similarly by using squared distances, i.e., $d^{2}(\cdot,\cdot)$ instead of $d(\cdot,\cdot)$. The current best approximation guarantee for both problems is $O\left( \frac{\log \ell}{\log \log \ell} \right)$ due to Makarychev and Vakilian [COLT 2021]. In this work, we study the fixed parameter tractability of the problems with respect to parameter $k$. We design $(3+\varepsilon)$ and $(9 + \varepsilon)$ approximation algorithms for the socially fair $k$-median and $k$-means problems, respectively, in FPT (fixed parameter tractable) time $f(k,\varepsilon) \cdot n^{O(1)}$, where $f(k,\varepsilon) = (k/\varepsilon)^{O(k)}$ and $n = |P \cup F|$. Furthermore, we show that if Gap-ETH holds, then better approximation guarantees are not possible in FPT time.
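To make the objective concrete, here is a minimal Python sketch that evaluates the maximum average group cost defined in the abstract and finds an optimal center set by exhaustive search on a toy instance. The function names `fair_cost` and `brute_force_fair_clustering`, the choice of the Euclidean plane as the metric space, and the toy data are all assumptions made for illustration. The exhaustive search is not the FPT approximation algorithm of the paper; it only illustrates what is being minimized.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def fair_cost(centers, groups, power=1):
    """Socially fair cost: the maximum, over groups, of the average of
    (distance to the closest center) ** power.
    power=1 gives the k-median objective, power=2 the k-means objective."""
    return max(
        sum(min(dist(c, x) for c in centers) ** power for x in group) / len(group)
        for group in groups
    )


def brute_force_fair_clustering(groups, feasible, k, power=1):
    """Illustrative baseline only: try every k-subset of the feasible
    centers and return one with the smallest socially fair cost."""
    return min(
        combinations(feasible, k),
        key=lambda centers: fair_cost(centers, groups, power),
    )


# Hypothetical toy instance: two groups of points in the plane.
groups = [
    [(0.0, 0.0), (2.0, 0.0)],    # group P_1
    [(10.0, 0.0), (10.0, 4.0)],  # group P_2
]
feasible = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]  # set F

best = brute_force_fair_clustering(groups, feasible, k=2)
print(best, fair_cost(best, groups))
```

On this instance, any optimal solution places one center in each group: the group with the larger spread, $P_2$, then determines the cost. The exhaustive search examines $\binom{|F|}{k}$ candidate sets, so it is only practical for very small inputs.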

The new version gives tight approximation results. However, the old version uses techniques that work in the streaming setting, albeit at the cost of weaker approximation guarantees. Readers interested in the streaming setting may therefore want to see the older version.


Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Classification and discrimination; cluster analysis (statistical aspects), Parameterized complexity, tractability and kernelization, fairness, Approximation algorithms, Machine Learning (cs.LG), fixed-parameter tractability, Computer Science - Data Structures and Algorithms, Data Structures and Algorithms (cs.DS), $k$-means, approximation algorithms, clustering

Impact by BIP!

- selected citations: 8
- popularity: Top 10%
- influence: Average
- impulse: Top 10%


Green

Fields of Science

natural sciences

computer and information sciences
