Normalised Clustering Accuracy: An Asymmetric External Cluster Validity Measure

Name: Normalised Clustering Accuracy: An Asymmetric External Cluster Validity Measure
Creator: Marek Gagolewski
Keywords: FOS: Computer and information sciences, Classification and discrimination; cluster analysis (statistical aspects), accuracy, adjusted Rand index, 05 social sciences, Machine Learning (stat.ML), 02 engineering and technology, normalisation, Machine Learning (cs.LG), Machine Learning

Journal of Classification. Article, 2024, peer-reviewed. License: CC BY. Data sources: Crossref

arXiv.org e-Print Archive. Preprint, 2022. Data sources: arXiv.org e-Print Archive

zbMATH Open. Article, 2025. Data sources: zbMATH Open

https://dx.doi.org/10.48550/ar... Article, 2022. License: CC BY. Data sources: Datacite

DBLP. Article, 2025. Data sources: DBLP

Publication type: Article, Preprint
Date: 28 Jun 2024
Embargo end date: 01 Jan 2022
Language: English
Publisher: Springer Science and Business Media LLC
Journal: Journal of Classification, volume 42, pages 2-30 (ISSN: 0176-4268, eISSN: 1432-1343)
Funded by: ARC | Discovery Projects - Gran...

Authors: Marek Gagolewski

doi: 10.1007/s00357-024-09482-2 , 10.48550/arxiv.2209.02935

arXiv: 2209.02935

Abstract

There is no, nor will there ever be, a single best clustering algorithm. Nevertheless, we would still like to be able to distinguish between methods that work well on certain task types and those that systematically underperform. Clustering algorithms are traditionally evaluated using either internal or external validity measures. Internal measures quantify different aspects of the obtained partitions, e.g., the average degree of cluster compactness or point separability. However, their validity is questionable because the clusterings they endorse can sometimes be meaningless. External measures, on the other hand, compare the algorithms’ outputs to fixed ground truth groupings provided by experts. In this paper, we argue that the commonly used classical partition similarity scores, such as the normalised mutual information, Fowlkes–Mallows, or adjusted Rand index, miss some desirable properties. In particular, they do not identify worst-case scenarios correctly, nor are they easily interpretable. As a consequence, the evaluation of clustering algorithms on diverse benchmark datasets can be difficult. To remedy these issues, we propose and analyse a new measure: a version of the optimal set-matching accuracy, which is normalised, monotonic with respect to some similarity relation, scale-invariant, and corrected for the imbalancedness of cluster sizes (but neither symmetric nor adjusted for chance).
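The abstract does not state the formula, so the following is only a minimal sketch of the general idea of an optimal set-matching accuracy that is normalised and corrected for imbalanced cluster sizes. It assumes that both partitions have the same number of clusters k, that each reference cluster is scored by the fraction of its points captured by the predicted cluster matched to it, and that the average is rescaled using the baseline 1/k. The function names `contingency_table` and `set_matching_accuracy_sketch` are hypothetical, and the exact definition in the paper may differ.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def contingency_table(y_true, y_pred):
    """Count co-occurrences: rows are reference clusters, columns are predicted ones."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    true_labels, true_idx = np.unique(y_true, return_inverse=True)
    pred_labels, pred_idx = np.unique(y_pred, return_inverse=True)
    table = np.zeros((true_labels.size, pred_labels.size), dtype=np.int64)
    np.add.at(table, (true_idx, pred_idx), 1)
    return table


def set_matching_accuracy_sketch(y_true, y_pred):
    """Illustrative score only (an assumption, not the paper's verified formula).

    1. Divide each row of the contingency table by the reference cluster size,
       so that small and large reference clusters weigh the same.
    2. Find the one-to-one matching of reference to predicted clusters that
       maximises the summed row-wise proportions.
    3. Average the matched proportions and rescale with the baseline 1/k.
    """
    table = contingency_table(y_true, y_pred)
    k = table.shape[0]
    if k < 2 or table.shape[1] != k:
        raise ValueError("this sketch assumes the same number (>= 2) of clusters in both partitions")
    proportions = table / table.sum(axis=1, keepdims=True)
    rows, cols = linear_sum_assignment(proportions, maximize=True)
    mean_matched = proportions[rows, cols].mean()
    return (mean_matched - 1.0 / k) / (1.0 - 1.0 / k)


# Relabelling the predicted clusters does not change the score.
perfect = set_matching_accuracy_sketch([0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1])

# Swapping the roles of the two partitions can change the score (asymmetry).
a = [0, 0, 0, 0, 0, 0, 1, 1]
b = [0, 0, 0, 1, 1, 1, 1, 1]
score_ab = set_matching_accuracy_sketch(a, b)
score_ba = set_matching_accuracy_sketch(b, a)
```

Because the rows are normalised by the reference cluster sizes only, the two arguments play different roles. Under these assumptions, that is why `score_ab` and `score_ba` differ, in line with the abstract's remark that the measure is not symmetric.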

Keywords

FOS: Computer and information sciences, Classification and discrimination; cluster analysis (statistical aspects), accuracy, adjusted Rand index, Machine Learning (stat.ML), normalisation, Machine Learning (cs.LG), Machine Learning, external cluster validity, optimal set matching, mutual information, clustering

Impact by BIP!

selected citations: 1. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Open access routes: Green, hybrid

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

Funded by

ARC | Discovery Projects - Grant ID: DP210100227