An improved primal-dual approximation algorithm for the k-means problem with penalties

Publication: Article, 16 Aug 2021, English

Publisher: Cambridge University Press (CUP)

Journal: Mathematical Structures in Computer Science, volume 32, pages 151-163 (ISSN: 0960-1295, eISSN: 1469-8072)

Authors: Chunying Ren; Dachuan Xu; Donglei Du; Min Li;

doi: 10.1017/s0960129521000104


Abstract

In the k-means problem with penalties, we are given an integer k and a data set $$\mathcal{D} \subseteq \mathbb{R}^\ell$$ of n points, where each point $$j \in \mathcal{D}$$ is associated with a penalty cost $$p_j$$. The goal is to choose a set $$CS \subseteq \mathbb{R}^\ell$$ with $$|CS| \le k$$ and a penalized subset $$\mathcal{D}_p \subseteq \mathcal{D}$$ to minimize the sum of the total squared distance from the points in $$\mathcal{D} \setminus \mathcal{D}_p$$ to $$CS$$ and the total penalty cost of the points in $$\mathcal{D}_p$$, namely $$\sum\nolimits_{j \in \mathcal{D} \setminus \mathcal{D}_p} d^2(j, CS) + \sum\nolimits_{j \in \mathcal{D}_p} p_j$$. We employ the primal-dual technique to give a pseudo-polynomial time algorithm with an approximation ratio of (6.357+ε) for the k-means problem with penalties, improving on the previous best approximation ratio of (19.849+ε) for this problem, given by Feng et al. in Proceedings of FAW (2019).
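To make the objective concrete, the following is a minimal sketch that evaluates it for a fixed set of centers. It is not the primal-dual algorithm of the article. It assumes that d²(j, CS) denotes the squared Euclidean distance from point j to its nearest center, and it uses the observation that, once the centers are fixed, each point independently contributes the smaller of its squared distance and its penalty. The function and variable names are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def penalized_kmeans_cost(
    points: Sequence[Point],
    penalties: Sequence[float],
    centers: Sequence[Point],
) -> Tuple[float, List[int]]:
    """Evaluate the objective for fixed centers (illustrative helper).

    Returns the objective value and the indices of the penalized points.
    A point is penalized exactly when its penalty is strictly smaller
    than its squared distance to the nearest center.
    """
    if len(points) != len(penalties):
        raise ValueError("points and penalties must have the same length")

    total = 0.0
    penalized: List[int] = []
    for j, (point, p_j) in enumerate(zip(points, penalties)):
        # With no centers at all, the distance is treated as infinite,
        # so every point ends up penalized.
        d2 = min(
            (squared_distance(point, c) for c in centers),
            default=float("inf"),
        )
        if p_j < d2:
            penalized.append(j)
            total += p_j
        else:
            total += d2
    return total, penalized


# Toy instance (made-up numbers): two nearby points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
penalties = [5.0, 5.0, 3.0]
centers = [(0.5, 0.0)]

cost, penalized = penalized_kmeans_cost(points, penalties, centers)
print(cost, penalized)  # 3.5 [2]
```

In this toy instance the first two points each pay a squared distance of 0.25, while the distant point is cheaper to penalize at cost 3, giving a total of 3.5. Choosing the centers themselves is the hard part of the problem and is not addressed by this sketch.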

Related Organizations

Beijing University of Technology
China (People's Republic of)
University of New Brunswick
Canada
Shandong Normal University
China (People's Republic of)

Keywords

linear program, $k$-means problem with penalties, JV algorithm, approximation algorithms, computational aspects of data analysis and big data

Impact by BIP!

	selected citations	1
	popularity	Average
	influence	Average
	impulse	Average


Fields of Science

natural sciences

computer and information sciences

