
arXiv: 2103.01470
Since network data commonly consists of observations from a single large network, researchers often partition the network into clusters in order to apply cluster‐robust inference methods. Existing such methods require clusters to be asymptotically independent. Under mild conditions, we prove that, for this requirement to hold for network‐dependent data, it is necessary and sufficient that clusters have low conductance, the ratio of edge boundary size to volume. This yields a simple measure of cluster quality. We find in simulations that when clusters have low conductance, cluster‐robust methods control size better than HAC estimators. However, for important classes of networks lacking low‐conductance clusters, the former can exhibit substantial size distortion. To determine the number of low‐conductance clusters and construct them, we draw on results in spectral graph theory that connect conductance to the spectrum of the graph Laplacian. Based on these results, we propose to use the spectrum to determine the number of low‐conductance clusters and spectral clustering to construct them.
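As a rough illustration of the quantities named in the abstract, the sketch below computes conductance as edge boundary size divided by volume (the sum of degrees in the cluster) and uses the normalized graph Laplacian to suggest a number of clusters and a two-way split. The function names, the toy graph, the eigengap heuristic, and the sign-based split are assumptions made for this example, not the paper's exact procedure. Some references define conductance with the smaller of the two volumes in the denominator, and this sketch follows the abstract's wording instead.

```python
import numpy as np

def conductance(A, members):
    """Edge boundary size of a node set divided by its volume (sum of degrees)."""
    A = np.asarray(A, dtype=float)
    mask = np.zeros(A.shape[0], dtype=bool)
    mask[list(members)] = True
    boundary = A[mask][:, ~mask].sum()   # edges leaving the set
    volume = A[mask].sum()               # sum of degrees inside the set
    return boundary / volume if volume > 0 else 0.0

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    pos = deg > 0
    inv_sqrt[pos] = deg[pos] ** -0.5
    return np.eye(A.shape[0]) - inv_sqrt[:, None] * A * inv_sqrt[None, :]

# Hypothetical toy network: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# eigh returns eigenvalues in ascending order, eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(normalized_laplacian(A))

# Eigengap heuristic (an assumption here): pick k at the largest gap
# between consecutive eigenvalues.
k = int(np.argmax(np.diff(eigvals))) + 1

# Two-way spectral split: sign of the eigenvector for the second-smallest eigenvalue.
labels = (eigvecs[:, 1] > 0).astype(int)

phi = conductance(A, [0, 1, 2])
```

On this toy graph each triangle has one boundary edge and volume 7, so its conductance is 1/7, and the sign split separates the two triangles. With more than two clusters, a common choice is to run k-means on the first k eigenvectors rather than use a sign split.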
FOS: Computer and information sciences, social networks, spectral clustering, Classification and discrimination; cluster analysis (statistical aspects), Graphs and linear algebra (matrices, eigenvalues, etc.), Random graphs (graph-theoretic aspects), Econometrics (econ.EM), Methodology (stat.ME), FOS: Economics and business, Applications of statistics to economics, clustered standard errors, Statistics - Methodology, Social networks; opinion dynamics, Economics - Econometrics
| selected citations | 7 |
| popularity | Top 10% |
| influence | Average |
| impulse | Top 10% |
