publication . Article . Other literature type . 2009

FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix

Morgan Price;
Open Access English
  • Published: 17 Apr 2009 Journal: Molecular Biology and Evolution, volume 26, issue 7, pages 1,641-1,650 (issn: 0737-4038, eissn: 1537-1719, Copyright policy)
  • Publisher: Oxford University Press
Abstract
Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N2) space and O(N2L) time, ...
Subjects
free text keywords: Research Articles, minimum evolution, Neighbor-Joining, large phylogenies, k-nearest neighbors algorithm, Biology, Speedup, Pairwise comparison, Algorithm, Neighbor joining, Distance matrix, Bioinformatics, Bootstrapping, Genetics, Joins, Network topology
33 references, page 1 of 3

Alm, EJ, Huang, KH, Price, MN, Koche, RP, Keller, K, Dubchak, IL, Arkin, AP. The MicrobesOnline Web site for comparative genomics. Genome Res.. 2005; 15: 1015-1022 [OpenAIRE] [PubMed]

Bininda-Emonds, OR, Brady, SG, Kim, J, Sanderson, MJ. Scaling of accuracy in extremely large phylogenetic trees. Pac Symp Biocomput. 2001; 2001: 547-558 [OpenAIRE] [PubMed]

DeLong, ER, Clarke-Pearson, DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1998; 44: 837-845 [PubMed]

DeSantis, TZ, Hugenholtz, P, Larsen, N, Rojas, M, Brodie, EL, Keller, K, Huber, T, Dalevi, D, Hu, P, Andersen, GL. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol.. 2006; 72: 5069-5072 [OpenAIRE] [PubMed]

Desper, R, Gascuel, O. Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. J Comput Biol.. 2002; 9: 687-705 [PubMed]

Efron, B, Halloran, E, Holmes, S. Bootstrap confidence levels for phylogenetic trees. Proc Natl Acad Sci USA. 1996; 93: 13429-13434 [OpenAIRE] [PubMed]

Eisen, JA. Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res.. 1998; 8: 163-167 [PubMed]

Elias, I, Lagergren, J. Fast neighbor joining. In: Proceedings of the 32nd International Colloquium on Automata. Languages and Programming (ICALP'05) Lecture Notes in Computer Science. 2005; 3580: 1263-1274

Engelhardt, BE, Jordan, MI, Muratore, KE, Brenner, SE. Protein molecular function prediction by Bayesian phylogenomics. PLoS Comput Biol. 1:e45. 2005

Evans, J, Sheneman, L, Foster, J. Relaxed neighbor joining: a fast distance-based phylogenetic tree construction method. J Mol Evol.. 2006; 62: 785-792 [OpenAIRE] [PubMed]

Felsenstein, J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985; 39: 783-791

Finn, RD, Mistry, J, Schuster-Böckler, B. Pfam: clans, web tools and services. Nucleic Acids Res.. 2006; 34: D247-D251 [OpenAIRE]

Gascuel, O. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol Biol Evol.. 1997; 14: 685-695 [OpenAIRE] [PubMed]

Gascuel, O, Steel, M. Neighbor-joining revealed. Mol Biol Evol.. 2006; 23: 1997-2000 [OpenAIRE] [PubMed]

Guindon, S, Gascuel, O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol.. 2003; 52: 696-704 [OpenAIRE] [PubMed]

33 references, page 1 of 3
Powered by OpenAIRE Open Research Graph
Any information missing or wrong?Report an Issue
publication . Article . Other literature type . 2009

FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix

Morgan Price;