Publication: Article, Preprint, Research, Report, 01 Aug 2019

Embargo end date: 01 Jan 2017

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Journal: IEEE Transactions on Information Theory, volume 65, pages 4875-4892 (issn: 0018-9448, eissn: 1557-9654)

Authors: Kirill Efimov; Larisa Adamyan; Vladimir Spokoiny

doi: 10.1109/tit.2019.2903113, 10.48550/arxiv.1709.09102

arXiv: 1709.09102

handle: 10419/230729

Adaptive Nonparametric Clustering

Abstract

This paper presents a new approach to nonparametric cluster analysis called Adaptive Weights Clustering (AWC). The idea is to identify the clustering structure by checking, at different points and for different scales, for departures from local homogeneity. The proposed procedure describes the clustering structure in terms of weights \( w_{ij} \), each of which measures the degree of local inhomogeneity between two neighboring local clusters using statistical tests of "no gap" between them. The procedure starts from a very local scale; the locality parameter then grows by some factor at each step. The method is fully adaptive and does not require specifying the number of clusters or their structure. The clustering results are not sensitive to noise and outliers, and the procedure is able to recover different clusters with sharp edges or manifold structure. The method is scalable and computationally feasible. An intensive numerical study shows state-of-the-art performance of the method on various artificial examples and in applications to text data. Our theoretical study establishes optimal sensitivity of AWC to local inhomogeneity.
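
The multiscale idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the function name `adaptive_weights_sketch`, its parameters (`h0`, `factor`, `steps`, `threshold`), and above all the overlap ratio used in place of the paper's statistical "no gap" test are assumptions made purely for illustration. The sketch only shows the general shape of the loop: binary weights w_ij start at a very local scale, the locality radius grows by a fixed factor, and each weight is recomputed from the local clusters found at the previous step.

```python
import math

def adaptive_weights_sketch(points, h0, factor=2.0, steps=6, threshold=0.4):
    """Toy multiscale weight update. The overlap ratio below is a
    hypothetical stand-in for the paper's "no gap" test, not the AWC statistic."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Most local scale: link every pair closer than the starting radius h0.
    h = h0
    w = [[1 if dist[i][j] <= h else 0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        h *= factor  # the locality parameter grows by a fixed factor
        # Local cluster of i = points linked to i under the previous weights.
        local = [{l for l in range(n) if w[i][l]} for i in range(n)]
        new_w = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if dist[i][j] > h:
                    continue  # pair is not yet within the current scale
                shared = len(local[i] & local[j])
                ratio = shared / min(len(local[i]), len(local[j]))
                # A small overlap between the two local clusters is read as a gap.
                new_w[i][j] = new_w[j][i] = 1 if ratio >= threshold else 0
        w = new_w
    return w

# Two well-separated groups on a line (illustrative data, not from the paper).
points = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,), (5.1,), (5.2,), (5.3,)]
weights = adaptive_weights_sketch(points, h0=0.12)
clusters = {frozenset(j for j, v in enumerate(row) if v) for row in weights}
```

On this input the final weight matrix splits into two blocks, one per group, without the number of clusters being passed in. A faithful implementation would replace the overlap ratio with the likelihood-ratio style test described in the paper.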

Related Organizations

Leibniz Association, Germany
Humboldt-Universität zu Berlin, Germany
Weierstrass Institute for Applied Analysis and Stochastics, Germany
Weierstrass Institute, Germany

Keywords

ddc:510, FOS: Computer and information sciences, manifold clustering, 330, ddc:330, article, Machine Learning (stat.ML), adaptive weights -- clustering -- gap coefficient -- manifold clustering, 510, Primary 62H30, Secondary 62G10, Statistics - Machine Learning, adaptive weights, gap coefficient, Adaptive weights -- clustering -- Kullback-Leibler -- manifold detection, 62H30, C00, 62G10, clustering

Related Research

clustering-benchmark software on GitHub (relation: IsRelatedTo)

Impact by BIP!

- selected citations: 7. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.