
pmid: 16761921
BLAST is the most popular bioinformatics tool and is used to run millions of queries each day. However, evaluating such queries is slow, taking typically minutes on modern workstations. Therefore, continuing evolution of BLAST--by improving its algorithms and optimizations--is essential to improve search times in the face of exponentially increasing collection sizes. We present an optimization to the first stage of the BLAST algorithm specifically designed for protein search. It produces the same results as NCBI-BLAST but in around 59% of the time on Intel-based platforms; we also present results for other popular architectures. Overall, this is a saving of around 15% of the total typical BLAST search time. Our approach uses a deterministic finite automaton (DFA), inspired by the original scheme used in the 1990 BLAST algorithm. The techniques are optimized for modern hardware, making careful use of cache-conscious approaches to improve speed. Our optimized DFA approach has been integrated into a new version of BLAST that is freely available for download at http://www.fsa-blast.org/.
Sequence Homology, Amino Acid, Computational Biology, Sequence Alignment, Algorithms, Software
Sequence Homology, Amino Acid, Computational Biology, Sequence Alignment, Algorithms, Software
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 25 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 1% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 1% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
