Powered by OpenAIRE graph
ZENODO
Dataset . 2025
License: CC 0
Data sources: ZENODO, Datacite

Simulated metagenomic DNA sequencing reads for complete NCBI RefSeq virus sequences

Authors: Constantinides, Bede;

Abstract

This dataset comprises simulated DNA sequencing reads in FASTQ format for 17,900 complete NCBI RefSeq virus sequences downloaded on 2025-02-05. Included are simulated long Oxford Nanopore Technologies R10.4 reads with ~4% error rate and simulated short (2x150bp) Illumina reads with 1% error rate. The source reference sequences are provided as rsviruses17900.fa.gz.

Simulated long reads (Oxford Nanopore Technologies)

File: rsviruses17900.fastq.gz

  • Measured empirical error rate: ~4%
  • Simulator: PBSIM 3.0.4 (https://academic.oup.com/nargab/article/4/4/lqac092/6855700)
  • Model: ERRHMM-ONT-HQ
  • Depth: 10x
  • Mean read length: 1,000bp
  • Max read length: 10,000bp
  • Mean accuracy: 0.98
  • Random seed: 1

Command used:

    for fasta in rsviruses17900/*.fa; do
        acc=$(basename "$fasta" .fa)
        pbsim --seed 1 --strategy wgs --method errhmm --errhmm pbsim3/data/ERRHMM-ONT-HQ.model --depth 10 --genome ${fasta} --prefix ${acc} --id-prefix ${acc}__ --length-mean 1000 --length-max 10000 --accuracy-mean 0.98;
        cat ${acc}*.fastq | pigz > ${acc}.fastq.gz
    done

Simulated short reads (Illumina)

Files: rsviruses17900.r1.fastq.gz and rsviruses17900.r2.fastq.gz

  • Measured empirical error rate: 1%
  • Simulator: dwgsim 0.1.14; conda package version 1.1.14 (https://github.com/nh13/DWGSIM)
  • Read length: 2x150bp (paired)
  • Depth: 10x
  • Random read probability (-y): 0
  • Error rate (-e and -E): 0.01
  • Mutation rate (-r): 0.0
  • Of which low frequency somatic mutations (-F): 0.0
  • Random seed (-z): 1

Command used:

    for fasta in rsviruses17900/*.fa; do
        acc=$(basename "$fasta" .fa)
        dwgsim -C 10 -1 150 -2 150 -y 0.0 -o 1 -z 1 -F 0.0 -r 0.0 -e 0.01 -E 0.01 "$fasta" "$acc"
    done
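Because the PBSIM command sets --id-prefix to the accession followed by a double underscore, each long read should be traceable to its source sequence from its header alone. The following is a minimal sketch of a per-accession read count, not part of the dataset's own tooling. The function name count_reads_per_accession is hypothetical, and the sketch assumes four-line FASTQ records whose headers begin with "@<accession>__".

```shell
# Count simulated long reads per source accession.
# Assumption: four-line FASTQ records with headers of the form
# "@<accession>__...", as implied by --id-prefix ${acc}__ above.
count_reads_per_accession() {
    awk 'NR % 4 == 1 {
             id = substr($1, 2)      # drop the leading "@"
             sub(/__.*/, "", id)     # keep the text before the first "__"
             counts[id]++
         }
         END { for (id in counts) print id "\t" counts[id] }' | sort
}

# Usage (file name taken from the dataset description):
# gzip -dc rsviruses17900.fastq.gz | count_reads_per_accession | head
```

The output is one tab-separated line per accession, which can be compared against the number of reference sequences to check that every genome contributed reads.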

Keywords

Metagenomics, Genomics

  • Impact by BIP!
    selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.