
<script type="text/javascript">
<!--
document.write('<div id="oa_widget"></div>');
document.write('<script type="text/javascript" src="https://www.openaire.eu/index.php?option=com_openaire&view=widget&format=raw&projectId=undefined&type=result"></script>');
-->
</script>In this paper we address the problem of testing whether two observed trees $(t,t')$ are sampled either independently or from a joint distribution under which they are correlated. This problem, which we refer to as correlation detection in trees, plays a key role in the study of graph alignment for two correlated random graphs. Motivated by graph alignment, we investigate the conditions of existence of one-sided tests, i.e. tests which have vanishing type I error and non-vanishing power in the limit of large tree depth. For the correlated Galton-Watson model with Poisson offspring of mean $λ>0$ and correlation parameter $s \in (0,1)$, we identify a phase transition in the limit of large degrees at $s = \sqrtα$, where $α\sim 0.3383$ is Otter's constant. Namely, we prove that no such test exists for $s \leq \sqrtα$, and that such a test exists whenever $s > \sqrtα$, for $λ$ large enough. This result sheds new light on the graph alignment problem in the sparse regime (with $O(1)$ average node degrees) and on the performance of the MPAlign method studied in Ganassali et al. (2021), Piccioli et al. (2021), proving in particular the conjecture of Piccioli et al. (2021) that MPAlign succeeds in the partial recovery task for correlation parameter $s>\sqrtα$ provided the average node degree $λ$ is large enough. As a byproduct, we identify a new family of orthogonal polynomials for the Poisson-Galton-Watson measure which enjoy remarkable properties. These polynomials may be of independent interest for a variety of problems involving graphs, trees or branching processes, beyond the scope of graph alignment.
28 pages, 2 figures, to appear in Annals of Applied Probability
FOS: Computer and information sciences, Probability (math.PR), FOS: Physical sciences, Mathematics - Statistics Theory, Machine Learning (stat.ML), [MATH] Mathematics [math], Statistics Theory (math.ST), 004, 510, Statistics - Machine Learning, Physics - Data Analysis, Statistics and Probability, FOS: Mathematics, [MATH]Mathematics [math], Mathematics - Probability, Data Analysis, Statistics and Probability (physics.data-an)
FOS: Computer and information sciences, Probability (math.PR), FOS: Physical sciences, Mathematics - Statistics Theory, Machine Learning (stat.ML), [MATH] Mathematics [math], Statistics Theory (math.ST), 004, 510, Statistics - Machine Learning, Physics - Data Analysis, Statistics and Probability, FOS: Mathematics, [MATH]Mathematics [math], Mathematics - Probability, Data Analysis, Statistics and Probability (physics.data-an)
| citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
