ZENODO
Dataset . 2025
License: CC BY
Data sources: Datacite
Versions: 4

Test data for nf-updhmm nextflow pipeline

Authors: Sevilla Porras, Marta; Ruiz-Arenas, Carlos

Abstract

nf-UPDhmm Test Data and Reference Files

The dataset provided for the nf-UPDhmm pipeline includes both test VCF files and the reference BED files required for preprocessing and execution.

1. Test data – VCF files

VCF files from the 1000 Genomes Project. The samples were extracted from the publicly available phased SNV/INDEL VCF release (https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/).

- mother.vcf.gz: HG00404
- father.vcf.gz: HG00403
- proband_control.vcf.gz: HG00405
- proband_heterodisomy.vcf.gz: HG00405 with a simulated heterodisomy event introduced

Note: To keep the test computationally light, all VCFs are restricted to a single region (chr21:29222885-34430153). This lets the pipeline test run quickly while preserving the structure of a real trio dataset.

2. Reference files – BEDs

In addition to the test VCFs, we provide BED files that define the regions excluded during preprocessing. These are reference files, not test data, but they are required for the correct execution of the nf-UPDhmm pipeline. The files are organized by reference genome version, indicated by a prefix:

- prefix "hg19"
- prefix "hg38"

The following files are provided:

- centromeres.bed: Centromeric and pericentromeric regions (±2 Mb). Sources:
    hg19: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/cytoBand.txt.gz
    hg38: https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/centromeres.txt.gz
- segmental_duplications.bed: Annotated segmental duplications. Sources:
    hg19: https://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/genomicSuperDups.txt.gz
    hg38: https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/genomicSuperDups.txt.gz
- hla_kir.bed: Highly polymorphic immune-related loci. Coordinates used:
    hg19: HLA: chr6:28,477,797–33,448,354; KIR: chr19:55,228,188–55,383,188
    hg38: HLA: chr6:28,510,120–33,480,577; KIR: chr19:54,025,634–55,084,318
- excluded_regions.bed: A combined file merging all of the above (centromeres, segmental duplications, HLA, and KIR).
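The way a combined file such as excluded_regions.bed can be derived from padded and merged intervals is sketched below in Python. This is a minimal illustration, not the authors' procedure: the record does not say which tool produced the files. The helper names `pad` and `merge` are hypothetical, and so are the centromere and segmental-duplication coordinates in the example. Only the hg38 HLA and KIR coordinates are taken from the description above, used as given without any 0-based BED conversion.

```python
from collections import defaultdict

PAD = 2_000_000  # ±2 Mb, as stated for centromeres.bed


def pad(intervals, amount):
    """Extend each (chrom, start, end) by `amount` on both sides, clamping at 0."""
    return [(c, max(0, s - amount), e + amount) for c, s, e in intervals]


def merge(intervals):
    """Merge overlapping or touching intervals, chromosome by chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):  # plain string sort, so "chr19" precedes "chr6"
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                cur_end = max(cur_end, end)  # extend the current block
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


# hg38 HLA and KIR coordinates from the description above.
hla_kir = [("chr6", 28_510_120, 33_480_577), ("chr19", 54_025_634, 55_084_318)]
# Hypothetical values, for illustration only (NOT the real annotations).
centromeres = [("chr6", 58_500_000, 59_800_000)]
segdups = [("chr6", 33_400_000, 33_500_000)]

excluded = merge(pad(centromeres, PAD) + segdups + hla_kir)
for chrom, start, end in excluded:
    print(chrom, start, end, sep="\t")
```

In this toy input the hypothetical segmental duplication overlaps the HLA interval, so the two collapse into one block, while the padded centromere stays separate.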

3. SFARI BED annotation file (UPDhmm package)

- SFARI.bed: This BED file contains curated recurrent regions derived from analysis of the SSC cohort. It is integrated into the UPDhmm workflow to assist with downstream interpretation of detected uniparental disomy (UPD) events.
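One plausible way an annotation file like this could support interpretation is a simple overlap check between a detected event and the annotated regions. The Python sketch below assumes that approach; the record does not describe how UPDhmm actually uses SFARI.bed. The helper names `parse_region` and `annotate`, the region names, and the region coordinates are all hypothetical. The queried interval is the chr21 test region from the description.

```python
def parse_region(text):
    """Parse 'chr:start-end' (commas allowed in numbers) into (chrom, start, end)."""
    chrom, span = text.split(":")
    start, end = span.replace(",", "").split("-")
    return chrom, int(start), int(end)


def annotate(event, regions):
    """Return the names of regions that overlap the event interval."""
    chrom, start, end = event
    return [
        name
        for r_chrom, r_start, r_end, name in regions
        if r_chrom == chrom and r_start < end and start < r_end
    ]


# Hypothetical annotation rows (chrom, start, end, name); NOT taken from SFARI.bed.
regions = [
    ("chr21", 30_000_000, 31_000_000, "hypothetical_region_A"),
    ("chr21", 40_000_000, 41_000_000, "hypothetical_region_B"),
    ("chr15", 30_000_000, 31_000_000, "hypothetical_region_C"),
]

event = parse_region("chr21:29222885-34430153")
print(annotate(event, regions))
```

Only the first hypothetical region lies on chr21 within the queried interval, so it is the only name returned.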
