Linear-time algorithms for phylogenetic tree completion under Robinson–Foulds distance

Name: Linear-time algorithms for phylogenetic tree completion under Robinson–Foulds distance
Creator: Mukul S. Bansal
Keywords: Phylogenetics, Optimal phylogenetic tree completion, 0301 basic medicine, Robinson–Foulds distance, 03 medical and health sciences, QH301-705.5, Research, 0206 medical engineering, Genetics, 02 engineering and technology

Algorithms for Molecular Biology

Article . 2020 . Peer-reviewed

License: CC BY

Data sources: Crossref

Publication type: Article, Other literature type
Date: 13 Apr 2020
Language: English
Publisher: Springer Science and Business Media LLC
Journal: Algorithms for Molecular Biology, volume 15 (eissn: 1748-7188)
Funded by: NSF | CAREER: Algorithms for Do...

Authors: Mukul S. Bansal

doi: 10.1186/s13015-020-00166-1

pmid: 32313549

pmc: PMC7155338

Abstract

Background: We consider two fundamental computational problems that arise when comparing phylogenetic trees, rooted or unrooted, with non-identical leaf sets. The first problem arises when comparing two trees where the leaf set of one tree is a proper subset of the other. The second problem arises when the two trees to be compared have only partially overlapping leaf sets. The traditional approach to handling these problems is to first restrict the two trees to their common leaf set. An alternative approach that has shown promise is to first complete the trees by adding missing leaves, so that the resulting trees have identical leaf sets. This requires the computation of an optimal completion that minimizes the distance between the two resulting trees over all possible completions.

Results: We provide optimal linear-time algorithms for both completion problems under the widely-used Robinson–Foulds (RF) distance measure. Our algorithm for the first problem improves the time complexity of the current fastest algorithm from quadratic (in the size of the two trees) to linear. No algorithms have yet been proposed for the more general second problem where both trees have missing leaves. We advance the study of this general problem by proposing a useful restricted version of the general problem and providing optimal linear-time algorithms for the restricted version. Our experimental results on biological data sets suggest that completion-based RF distances can be very different compared to traditional RF distances.
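To make the "traditional approach" mentioned in the abstract concrete, the following is a minimal Python sketch of a restriction-based RF distance for rooted trees. It is not the paper's linear-time completion algorithm. The tree encoding (nested tuples with string leaf labels) and the function names `leaves`, `restrict`, `clusters`, and `restricted_rf` are assumptions made for illustration, and RF is taken here as the unnormalized size of the symmetric difference of the two cluster sets.

```python
from typing import FrozenSet, Optional, Set, Tuple, Union

# Hypothetical encoding: a tree is a leaf label (str) or a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels of a tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def restrict(tree: Tree, keep: FrozenSet[str]) -> Optional[Tree]:
    """Restrict a tree to the leaves in `keep`, suppressing nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set below every internal node, excluding the root."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.add(below)
        return below

    found.discard(walk(tree))  # the root cluster is shared by all trees on the same leaves
    return found


def restricted_rf(t1: Tree, t2: Tree) -> int:
    """RF distance after restricting both trees to their common leaf set."""
    common = leaves(t1) & leaves(t2)
    r1, r2 = restrict(t1, common), restrict(t2, common)
    if r1 is None or r2 is None:
        return 0  # no shared leaves, so nothing to compare
    return len(clusters(r1) ^ clusters(r2))


if __name__ == "__main__":
    full = ((("a", "b"), "c"), ("d", "e"))
    missing_b = (("a", "c"), ("d", "e"))       # leaf set is a proper subset of `full`
    rearranged = (("a", "d"), ("c", "e"))
    print(restricted_rf(full, missing_b))
    print(restricted_rf(full, rearranged))
```

In this sketch, restricting `full` to the leaves it shares with `missing_b` discards the placement of `b` entirely, which is the information a completion-based distance would instead try to account for.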

Related Organizations

University of Connecticut
United States

Keywords

Phylogenetics, Optimal phylogenetic tree completion, Robinson–Foulds distance, QH301-705.5, Research, Genetics, Distance measures, Biology (General), QH426-470

Related research

- Information theoretic generalized Robinson–Foulds metrics for comparing phylogenetic trees (2020)
- A Linear Time Solution to the Labeled Robinson–Foulds Distance Problem (2020)

Impact (by BIP!)

- Selected citations: 3
- Popularity: Average
- Influence: Average
- Impulse: Average