Cache Oblivious Algorithms for Computing the Triplet Distance between Trees

Publication: Article, Preprint, Conference object
Date: 08 May 2021
Embargo end date: 01 Jan 2017
Country: Germany
Language: English
Publisher: Association for Computing Machinery (ACM)
Journal: ACM Journal of Experimental Algorithmics, volume 26, pages 1-44 (ISSN: 1084-6654, eISSN: 1084-6654)

Authors: Gerth Stølting Brodal; Konstantinos Mampentzidis;

doi: 10.1145/3433651 , 10.48550/arxiv.1706.10284

arXiv: 1706.10284

Abstract

We consider the problem of computing the triplet distance between two rooted unordered trees with n labeled leaves. Introduced by Dobson in 1975, the triplet distance is the number of leaf triples that induce different topologies in the two trees. The current theoretically fastest algorithm is an O(n log n) algorithm by Brodal et al. (SODA 2013). Recently, Jansson and Rajaby proposed a new algorithm that, while slower in theory at O(n log^3 n) time, outperforms the theoretically faster O(n log n) algorithm in practice. Neither algorithm scales to external memory. We present two cache oblivious algorithms that combine the best of both worlds. The first algorithm handles the case where both input trees are binary, and the second is a generalized algorithm for two input trees of arbitrary degree. Analyzed in the RAM model, both algorithms require O(n log n) time, and in the cache oblivious model they require O((n/B) log_2 (n/M)) I/Os. Their relative simplicity and the fact that they scale to external memory make them achieve the best practical performance. We note that these are the first algorithms for this problem that scale to external memory, both in theory and in practice.
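To make the definition of the triplet distance concrete, here is a minimal brute-force sketch in Python. It is not either of the algorithms from the paper: it simply enumerates all leaf triples, so it runs in roughly cubic time and is only meant to illustrate what is being counted. The nested-tuple tree representation and the function names (`leaf_paths`, `lca_depth`, `topology`, `triplet_distance`) are assumptions made for this illustration. A triple that is unresolved in one tree (a star) and resolved in the other is counted as different.

```python
from itertools import combinations


def leaf_paths(tree, prefix=()):
    """Map each leaf label to its root-to-leaf path of child indices.

    Assumed representation: an internal node is a tuple of children,
    and anything that is not a tuple is a leaf label.
    """
    if not isinstance(tree, tuple):
        return {tree: prefix}
    paths = {}
    for i, child in enumerate(tree):
        paths.update(leaf_paths(child, prefix + (i,)))
    return paths


def lca_depth(p, q):
    """Depth of the lowest common ancestor, i.e. the common prefix length."""
    depth = 0
    for x, y in zip(p, q):
        if x != y:
            break
        depth += 1
    return depth


def topology(paths, a, b, c):
    """Return the pair of leaves that is grouped together, or None for a star."""
    depths = {
        frozenset((a, b)): lca_depth(paths[a], paths[b]),
        frozenset((a, c)): lca_depth(paths[a], paths[c]),
        frozenset((b, c)): lca_depth(paths[b], paths[c]),
    }
    deepest = max(depths.values())
    closest = [pair for pair, d in depths.items() if d == deepest]
    # Exactly one deepest pair means a resolved triplet; a tie means unresolved.
    return closest[0] if len(closest) == 1 else None


def triplet_distance(t1, t2):
    """Count the leaf triples whose induced topology differs between t1 and t2."""
    p1, p2 = leaf_paths(t1), leaf_paths(t2)
    if p1.keys() != p2.keys():
        raise ValueError("both trees must have the same leaf labels")
    return sum(
        topology(p1, a, b, c) != topology(p2, a, b, c)
        for a, b, c in combinations(list(p1), 3)
    )


if __name__ == "__main__":
    t1 = ((("a", "b"), "c"), "d")
    t2 = ((("a", "c"), "b"), "d")
    print(triplet_distance(t1, t2))
```

In the usage example, the two trees disagree only on the triple {a, b, c}, which t1 groups as ab|c and t2 groups as ac|b.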

Country

Germany

Related Organizations

Aarhus University
Denmark
Schloss Dagstuhl – Leibniz Center for Informatics
Germany
Leibniz Association
Germany

Keywords

FOS: Computer and information sciences, Data structures, cache-oblivious algorithm, triplet distance, tree comparison, Problems related to evolution, Computer Science - Data Structures and Algorithms, Analysis of algorithms, Data Structures and Algorithms (cs.DS), phylogenetic tree, ddc:004

Related research products (2)

sparsehash-c11 software on GitHub (IsRelatedTo)
CacheTD software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 2
popularity: Average
influence: Top 10%
impulse: Average