
Abstract

A multi-task parallel algorithm is applied to accelerate DNA Sequence Reads Compressor (DSRC), a program specialized for compressing DNA sequencing data files in FASTQ format. The compression process is first divided into two parallel tasks: a data input task and a data processing task. The data processing task is then further divided into two parts: Title data processing and processing of the remaining data. Three highly parallel models are developed for Title data processing: a multi-CPU model, a CPU+GPU model, and a CPU+MIC model. All three models are tested, and the results show that a speedup of nearly 3x can be achieved. The maximum throughput observed among all tested cases is 144 MB/s, and the average throughput is 120 MB/s. The speedup depends on the sequencing platform that produced the data: the program performs better on data from the ILLUMINA and SOLiD platforms.
FASTQ format file, Multi-task parallel, GPU, Compression, MIC, Multithread
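The task split described in the abstract can be illustrated with a minimal sketch. It is not the DSRC implementation: the function names, the block size, the queue depth, and the use of `zlib` as a stand-in codec are all assumptions made for illustration. One thread plays the data input task, a second plays the data processing task, and the processing task handles Title lines separately from the remaining fields.

```python
import io
import queue
import threading
import zlib

SENTINEL = None  # hypothetical end-of-input marker


def read_blocks(stream, out_q, records_per_block=2):
    """Data input task: read 4-line FASTQ records and enqueue them in blocks."""
    block = []
    while True:
        record = [stream.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # an empty title line means end of input
            break
        block.append(record)
        if len(block) == records_per_block:
            out_q.put(block)
            block = []
    if block:
        out_q.put(block)
    out_q.put(SENTINEL)


def process_blocks(in_q, results):
    """Data processing task: compress Title lines apart from the other fields."""
    while True:
        block = in_q.get()
        if block is SENTINEL:
            break
        titles = "\n".join(r[0] for r in block).encode()
        others = "\n".join("\n".join(r[1:]) for r in block).encode()
        # zlib is only a placeholder codec, not the scheme DSRC uses.
        results.append((zlib.compress(titles), zlib.compress(others)))


def compress_fastq(text):
    """Run the input and processing tasks concurrently over FASTQ text."""
    q = queue.Queue(maxsize=8)  # bounded so the reader cannot run far ahead
    results = []
    reader = threading.Thread(target=read_blocks, args=(io.StringIO(text), q))
    worker = threading.Thread(target=process_blocks, args=(q, results))
    reader.start()
    worker.start()
    reader.join()
    worker.join()
    return results
```

In this sketch the Title stream is isolated in its own buffer, which is the part the abstract describes offloading to multiple CPUs, a GPU, or a MIC coprocessor; the single worker thread here only marks where that offload would sit.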
