PASTA for proteins

Publication: Article, 19 Jun 2018, English

Publisher: Oxford University Press (OUP)

Journal: Bioinformatics, volume 34, pages 3939-3941 (ISSN: 1367-4803, eISSN: 1367-4811)

Funded by: NSF | ABI Innovation: New metho...

Authors: Kodi Collins; Tandy J. Warnow;

doi: 10.1093/bioinformatics/bty495

pmid: 29931282

pmc: PMC6223367

Abstract

Summary: PASTA is a multiple sequence alignment method that uses divide-and-conquer plus iteration to enable base alignment methods to scale with high accuracy to large sequence datasets. By default, PASTA included MAFFT L-INS-i; our new extension of PASTA enables the use of MAFFT G-INS-i, MAFFT Homologs, CONTRAlign and ProbCons. We analyzed the performance of each base method and PASTA using these base methods on 224 datasets from BAliBASE 4 with at least 50 sequences. We show that PASTA enables the most accurate base methods to scale to larger datasets at reduced computational effort, and generally improves alignment and tree accuracy on the largest BAliBASE datasets.

Availability and implementation: PASTA is available at https://github.com/kodicollins/pasta and has also been integrated into the original PASTA repository at https://github.com/smirarab/pasta.

Supplementary information: Supplementary data are available at Bioinformatics online.
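
The divide-and-conquer plus iteration strategy named in the abstract can be illustrated with a small control-flow sketch. This is not PASTA's implementation: every function name below is hypothetical, the base aligner and merger are toy stand-ins that only pad rows with gaps, and the decomposition simply halves a list of names, whereas PASTA decomposes along a guide tree and re-estimates that tree from the current alignment in each iteration. The sketch only shows how a pluggable base method is applied to small subsets whose alignments are then merged.

```python
from typing import Callable, Dict, List

Sequences = Dict[str, str]  # name -> unaligned sequence
Alignment = Dict[str, str]  # name -> gapped row, all rows the same length


def split(names: List[str], max_size: int) -> List[List[str]]:
    """Halve the list recursively until every subset has at most max_size names."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if len(names) <= max_size:
        return [names]
    mid = len(names) // 2
    return split(names[:mid], max_size) + split(names[mid:], max_size)


def toy_base_aligner(seqs: Sequences) -> Alignment:
    """Toy stand-in for a base method: right-pad rows with gaps to equal length."""
    width = max(len(s) for s in seqs.values())
    return {name: s.ljust(width, "-") for name, s in seqs.items()}


def toy_merger(a: Alignment, b: Alignment) -> Alignment:
    """Toy stand-in for an alignment merger: pad both blocks to a common width."""
    merged = {**a, **b}
    width = max(len(row) for row in merged.values())
    return {name: row.ljust(width, "-") for name, row in merged.items()}


def keep_order(names: List[str], alignment: Alignment) -> List[str]:
    """Hypothetical hook where a real pipeline would use a re-estimated tree."""
    return names


def divide_and_conquer_align(
    seqs: Sequences,
    max_size: int,
    base_aligner: Callable[[Sequences], Alignment] = toy_base_aligner,
    merger: Callable[[Alignment, Alignment], Alignment] = toy_merger,
    reorder: Callable[[List[str], Alignment], List[str]] = keep_order,
    iterations: int = 1,
) -> Alignment:
    """Align small subsets with the base method, merge them, and repeat."""
    if not seqs or iterations < 1:
        raise ValueError("need at least one sequence and one iteration")
    names = list(seqs)
    alignment: Alignment = {}
    for _ in range(iterations):
        names = reorder(names, alignment)
        subsets = split(names, max_size)
        blocks = [base_aligner({n: seqs[n] for n in subset}) for subset in subsets]
        alignment = blocks[0]
        for block in blocks[1:]:
            alignment = merger(alignment, block)
    return alignment
```

Because the base aligner is passed in as a callable, swapping one base method for another changes only the `base_aligner` argument, which mirrors the idea of running the same outer procedure with different base methods.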

Related Organizations

University of California
United States
University of Illinois
United States
University of Illinois at Urbana-Champaign
United States

Keywords

Computational Biology, Proteins, Databases, Protein, Applications Notes, Sequence Alignment, Algorithms, Software

Impact by BIP!

Selected citations: 7. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.