
arXiv: 2003.13462
The stochastic blockmodel (SBM) models the connectivity within and between disjoint subsets of nodes in networks. Prior work demonstrated that the rows of an SBM's adjacency spectral embedding (ASE) and Laplacian spectral embedding (LSE) both converge in law to Gaussian mixtures where the components are curved exponential families. Maximum likelihood estimation via the Expectation-Maximization (EM) algorithm for a full Gaussian mixture model (GMM) can then perform the task of clustering graph nodes, albeit without appealing to the components' curvature. Noting that EM is a special case of the Expectation-Solution (ES) algorithm, we propose two ES algorithms that allow us to take full advantage of these curved structures. After presenting the ES algorithm for the general curved-Gaussian mixture, we develop those corresponding to the ASE and LSE limiting distributions. Simulating from artificial SBMs and a brain connectome SBM reveals that clustering graph nodes via our ES algorithms can improve upon that of EM for a full GMM for a wide range of settings.
45 pages, version accepted by Electronic Journal of Statistics
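As a rough illustration of the baseline the abstract refers to (EM for a full GMM applied to the rows of the ASE), the following is a minimal sketch and not the authors' code. It does not implement the proposed ES algorithms or use the curved-Gaussian structure. The block probability matrix, block sizes, initialization scheme, and all function names are assumptions chosen for this example.

```python
import numpy as np

def simulate_sbm(block_sizes, B, rng):
    """Sample a symmetric, hollow adjacency matrix from an SBM (hypothetical helper)."""
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    P = B[labels][:, labels]                       # edge probability for each node pair
    upper = np.triu(rng.random(P.shape) < P, k=1)  # sample the strict upper triangle only
    A = (upper | upper.T).astype(float)            # symmetrize; diagonal stays zero
    return A, labels

def adjacency_spectral_embedding(A, d):
    """Top-d eigenvectors (by |eigenvalue|) scaled by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def gaussian_logpdf(X, mean, cov):
    """Log density of N(mean, cov) at each row of X, via a Cholesky factor."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)           # whitened residuals, shape (d, n)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (d * np.log(2 * np.pi) + log_det + (z ** 2).sum(axis=0))

def em_full_gmm(X, K, n_iter=100, reg=1e-6):
    """EM for a K-component GMM with unconstrained (full) covariances."""
    n, d = X.shape
    # Assumed initialization: greedy farthest-point choice of starting means.
    idx = [0]
    for _ in range(1, K):
        dist = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(dist.argmax()))
    means = X[idx].copy()
    covs = np.tile(np.cov(X, rowvar=False) + reg * np.eye(d), (K, 1, 1))
    weights = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: posterior component probabilities, computed in log space.
        log_r = np.stack(
            [np.log(weights[k]) + gaussian_logpdf(X, means[k], covs[k]) for k in range(K)],
            axis=1,
        )
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: closed-form updates of weights, means, and covariances.
        nk = r.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (r.T @ X) / nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (r[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)
    return r.argmax(axis=1)

# Toy two-block SBM; these parameter values are illustrative assumptions.
rng = np.random.default_rng(0)
B = np.array([[0.5, 0.1], [0.1, 0.5]])
A, labels = simulate_sbm([200, 200], B, rng)
X_hat = adjacency_spectral_embedding(A, d=2)
pred = em_full_gmm(X_hat, K=2)
# Cluster labels are only identified up to permutation, so score both matchings.
accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
```

The M-step here updates each mean and covariance freely, which is the sense in which a full GMM ignores the components' curvature. An ES variant would replace those updates with solutions of estimating equations that tie the covariances to the means.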
FOS: Computer and information sciences, curved exponential family, mixture model, Classification and discrimination; cluster analysis (statistical aspects), Statistics - Applications, Applications of statistics to biology and medical sciences; meta analysis, Methodology (stat.ME), Random matrices (probabilistic aspects), 62-08 (Primary) 62P15, 62P10 (Secondary), estimating equations, Applications (stat.AP), EM algorithm, Statistics - Methodology, random graph
