
arXiv: 1406.1780
Mode clustering is a nonparametric method for clustering that defines clusters using the basins of attraction of a density estimator's modes. We provide several enhancements to mode clustering: (i) a soft variant of cluster assignment, (ii) a measure of connectivity between clusters, (iii) a technique for choosing the bandwidth, (iv) a method for denoising small clusters, and (v) an approach to visualizing the clusters. Combining all these enhancements gives us a complete procedure for clustering in multivariate problems. We also compare mode clustering to other clustering methods in several examples.
34 pages, 17 figures. Accepted to the Electronic Journal of Statistics. The original title is "Enhanced Mode Clustering"
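
As an illustration of the basic idea in the abstract (clusters as basins of attraction of a density estimator's modes), here is a minimal sketch of hard-assignment mode clustering using a Gaussian-kernel mean shift iteration. It does not implement any of the paper's enhancements (soft assignment, connectivity, bandwidth selection, denoising, visualization). The function names, the toy data, the bandwidth `h=0.5`, and the merge tolerance are all assumptions chosen for illustration.

```python
import math

def mean_shift_step(x, data, h):
    """One mean-shift update of x: the Gaussian-weighted average of the data."""
    weights = [
        math.exp(-sum((xi - di) ** 2 for xi, di in zip(x, d)) / (2 * h * h))
        for d in data
    ]
    total = sum(weights)
    return tuple(
        sum(w * d[k] for w, d in zip(weights, data)) / total
        for k in range(len(x))
    )

def find_mode(x, data, h, tol=1e-6, max_iter=500):
    """Iterate mean shift from x until the step size drops below tol."""
    for _ in range(max_iter):
        new_x = mean_shift_step(x, data, h)
        if math.dist(x, new_x) < tol:
            return new_x
        x = new_x
    return x

def mode_cluster(data, h, merge_tol=1e-2):
    """Label each point by the mode its mean-shift path converges to."""
    modes, labels = [], []
    for p in data:
        m = find_mode(p, data, h)
        for j, existing in enumerate(modes):
            # Treat numerically close limit points as the same mode.
            if math.dist(m, existing) < merge_tol:
                labels.append(j)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return labels, modes

# Toy data (hypothetical): two well-separated groups in the plane.
data = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1), (0.05, 0.05),
        (5.0, 5.0), (5.1, 4.9), (4.9, 5.1), (5.05, 5.05)]
labels, modes = mode_cluster(data, h=0.5)
print(labels, len(modes))
```

In this sketch the bandwidth is fixed by hand; too small a value tends to produce many spurious modes and too large a value merges distinct groups, which is why bandwidth choice matters in practice.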
nonparametric clustering, FOS: Computer and information sciences, Kernel density estimation, Classification and discrimination; cluster analysis (statistical aspects), Machine Learning (stat.ML), Nonparametric inference, mean shift clustering, Methodology (stat.ME), Density estimation, soft clustering, Statistics - Machine Learning, 62H30 (Primary), 62G07, 62G99 (Secondary), visualization, Statistics - Methodology
| Indicator | Value |
| --- | --- |
| Selected citations | 32 |
| Popularity | Top 10% |
| Influence | Top 10% |
| Impulse | Top 10% |
