
Graph spectral clustering algorithms have been shown to be effective in finding clusters and generally outperform traditional clustering algorithms, such as k-means. However, they have scalibility issues in both memory usage and computational time. To overcome these limitations, the common approaches sparsify the similarity matrix by zeroing out some of its elements. They generally consider local neighborhood relationships between the data instances such as the k-nearest neighbor method. Although, they eventually work with the Laplacian matrix, there is no evidence about preservation of its spectral properties with approximation guarantees. As a result, in this paper, we adopt the idea of spectral sparsification to sparsify the Laplacian matrix. A spectral sparsification algorithm takes a dense graph G with n vertices and m edges (that is usually O(n2)), and returns a new graph H with the same set of vertices and many fewer edges, on the order of O(n log n), that approximately preserves the spectral properties of the input graph. We study the effect of the spectral sparsification method based on sampling by effective resistance on the spectral clustering outputs. Through experiments, we show that the clustering results obtained from sparsified graphs are very similar to the results of the original non-sparsified graphs.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 11 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
