Publication . Conference object . 2007

High-throughput computation of pairwise sequence similarities for multiple genome comparisons using ScalaBLAST

A.R. Shah; V.M. Markowitz; Christopher S. Oehmen
Published: 18 Dec 2007
Publisher: IEEE
Genome sequence comparisons of exponentially growing data sets form the foundation for the comparative analysis tools provided by community biological data resources such as the Integrated Microbial Genomes (IMG) system at the Joint Genome Institute (JGI). For a genome sequencing center to provide multiple-genome comparison capabilities, it must keep pace with an exponentially growing collection of sequence data, both from its own genomes and from public genomes. We present an example of how ScalaBLAST, a high-throughput sequence analysis program, harnesses increasingly critical high-performance computing to perform sequence analysis. For example, it enables all vs. all BLAST runs across 2 million protein sequences within a day using thousands of processors, whereas conventional comparison methods would take years to complete.
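
To give a sense of the scale the abstract describes, the following is a minimal sketch of the arithmetic behind an all vs. all run and of one simple way to divide query sequences among workers. It is an illustration only and does not represent ScalaBLAST's actual scheduling; the function names and the worker count of 1,000 are assumptions introduced here, and the abstract itself says only "thousands of processors".

```python
from typing import List


def all_vs_all_pairs(n_sequences: int) -> int:
    """Ordered query/subject pairs when every sequence is searched against every sequence."""
    return n_sequences * n_sequences


def split_queries(n_queries: int, n_workers: int) -> List[range]:
    """Divide query indices into contiguous, near-equal blocks, one per worker.

    Hypothetical static partitioning, not ScalaBLAST's own method.
    """
    base, extra = divmod(n_queries, n_workers)
    blocks = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional query each.
        size = base + (1 if worker < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


if __name__ == "__main__":
    n = 2_000_000      # protein sequences, from the abstract
    workers = 1_000    # assumed worker count, for illustration only
    print(all_vs_all_pairs(n))
    print(len(split_queries(n, workers)[0]))
```

Under these assumptions each worker would own 2,000 queries, each searched against the full set of 2 million sequences, which is why distributing the query set across many processors shortens the wall-clock time so sharply.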
Subjects by Vocabulary

Microsoft Academic Graph classification: Sequence analysis; Genetics; Genome; Whole genome sequencing; Pairwise comparison; Biological data; Sequence (medicine); DNA sequencing; Computational biology; Biology; Alignment-free sequence analysis

ACM Computing Classification System: ComputingMethodologies_PATTERNRECOGNITION

Providers: Crossref