
The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space, and we formulate and justify several formal requirements that a metric space on phylogenetic trees must satisfy to be suitable for statistical analysis. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce, and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the squared distance to the trees in the sample may differ between parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods.
Minor changes. This version has been published in JTB. 27 pages, 9 figures
Statistics and Probability, Time-tree, Mathematics - Statistics Theory, Statistics Theory (math.ST), CAT(0), Mathematics - Metric Geometry, Modelling and Simulation, Immunology and Microbiology(all), FOS: Mathematics, Simplicial complex, Phylogeny, Medicine(all), Curvature, Agricultural and Biological Sciences(all), Biochemistry, Genetics and Molecular Biology(all), Applied Mathematics, Metric Geometry (math.MG), Posterior distribution summary, Models, Theoretical, Geodesic, Ultrametric tree, Phylogenetic inference, Phylogenetic tree
| citations | 32 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
