
arXiv: 1509.08863
We build upon recent advances in graph signal processing to propose a faster spectral clustering algorithm. Classical spectral clustering relies on computing the first k eigenvectors of the Laplacian of the similarity matrix, a cost that becomes prohibitive for large datasets even when the matrix is sparse. We show that the spectral clustering distance matrix can be estimated without computing these eigenvectors, by graph filtering random signals. We also take advantage of the stochasticity of these random vectors to estimate the number of clusters k. We compare our method to classical spectral clustering on synthetic data and show that it reaches equal performance while being faster by a factor of at least two for large datasets.
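The idea of graph filtering random signals can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a normalized Laplacian with spectrum in [0, 2], a low-pass cutoff that is already known (the hypothetical `cutoff` parameter), and a Chebyshev polynomial interpolation of the ideal step filter, chosen here only for illustration. The helper names `normalized_laplacian`, `filter_random_signals`, and `two_cliques` are likewise made up for this example.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def normalized_laplacian(W):
    """Return L = I - D^{-1/2} W D^{-1/2} for a symmetric adjacency matrix W."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]


def filter_random_signals(L, cutoff, n_signals=30, degree=50, seed=0):
    """Low-pass filter Gaussian random signals on the graph without any
    eigendecomposition. `cutoff` is assumed known (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n = L.shape[0]
    # Random signals with variance 1/n_signals, one column per signal.
    R = rng.normal(scale=1.0 / np.sqrt(n_signals), size=(n, n_signals))
    # Chebyshev coefficients of the step filter, with eigenvalues in [0, 2]
    # mapped to x = lambda - 1 in [-1, 1].
    coeffs = cheb.chebinterpolate(
        lambda x: (x + 1.0 <= cutoff).astype(float), degree
    )
    L_shift = L - np.eye(n)
    # Three-term recurrence: T_{j+1} = 2 * L_shift * T_j - T_{j-1}.
    t_prev, t_curr = R, L_shift @ R
    out = coeffs[0] * t_prev + coeffs[1] * t_curr
    for c in coeffs[2:]:
        t_prev, t_curr = t_curr, 2.0 * (L_shift @ t_curr) - t_prev
        out += c * t_curr
    return out


def two_cliques(m=10):
    """Toy graph: two m-node cliques joined by a single edge."""
    W = np.zeros((2 * m, 2 * m))
    W[:m, :m] = 1.0
    W[m:, m:] = 1.0
    np.fill_diagonal(W, 0.0)
    W[m - 1, m] = W[m, m - 1] = 1.0
    return W


W = two_cliques()
# Each row of `features` is one node's feature vector.
features = filter_random_signals(normalized_laplacian(W), cutoff=0.5)
# Pairwise Euclidean distances between node feature vectors.
dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
```

On this toy graph, distances between nodes of the same clique should come out clearly smaller than distances across cliques, so a standard k-means step on `features` could stand in for k-means on the eigenvectors. Only matrix-vector products with the Laplacian are needed, which is what makes the approach attractive for large sparse graphs.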
Social and Information Networks (cs.SI), FOS: Computer and information sciences, [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, Spectral clustering, FOS: Mathematics, graph filtering, Computer Science - Social and Information Networks, Mathematics - Numerical Analysis, Numerical Analysis (math.NA), graph signal processing
