Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Software . 2025
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Software . 2026
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Software . 2024
License: CC BY
Data sources: ZENODO
ZENODO
Software . 2026
License: CC BY
Data sources: Datacite
ZENODO
Software . 2025
License: CC BY
Data sources: Datacite
ZENODO
Software . 2026
License: CC BY
Data sources: Datacite
ZENODO
Software . 2024
License: CC BY
Data sources: Datacite
ZENODO
Software . 2026
License: CC BY
Data sources: Datacite
versions View all 5 versions
addClaim

github.com/PacificBiosciences/HiFi-human-WGS-WDL/HiFi-human-WGS-WDL-family

Authors: Juniper Ada Lake; Gregory Concepcion; Heather Ward; William Rowell; Karen Fang; Aaron Wenger;

github.com/PacificBiosciences/HiFi-human-WGS-WDL/HiFi-human-WGS-WDL-family

Abstract

family.wdl inputs and outputs family.wdl inputs and outputs DAG (simplified) Inputs Family Struct Sample Struct Outputs Alignments, Coverage, and QC Small Variants ( bait_fasta --> bait_fail_reads fail_ubam --> bait_fail_reads --> pbmm2_align_fail_reads --> filter_fail_reads --> samtools_merge_fail_reads ubam --> pbmm2_align --> samtools_merge samtools_merge --> mosdepth samtools_merge --> paraphase samtools_merge --> mitorsaw samtools_merge_fail_reads --> trgt samtools_merge --> deepvariant samtools_merge --> sawfish_discover samtools_merge --> hiphase deepvariant --> sawfish_discover deepvariant --> glnexus sawfish_discover --> sawfish_call glnexus --> split_glnexus sawfish_call --> split_sawfish split_glnexus --> hiphase split_sawfish --> hiphase hiphase --> trgt hiphase --> bam_stats hiphase --> bcftools_roh hiphase --> bcftools_stats hiphase --> sv_stats hiphase --> cpg_pileup hiphase --> starphase hiphase --> pharmcat hiphase --> trgt_dropouts starphase --> pharmcat cpg_pileup --> methbat hiphase --> merge_small_variants hiphase --> merge_svs hiphase --> trgt_merge merge_small_variants --> slivar_small_variants merge_svs --> svpack svpack --> slivar_svpack Inputs | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Family | family | Family struct describing samples, relationships, and unaligned BAM paths | below | | File | ref_map_file | TSV containing reference genome file paths; must match backend | | | String? | phenotypes | Comma-delimited list of HPO terms. | Human Phenotype Ontology (HPO) phenotypes associated with the cohort.If omitted, tertiary analysis will be skipped. | | File? | tertiary_map_file | TSV containing tertiary analysis file paths and thresholds; must match backend | AF/AC/nhomalt thresholds can be modified, but this will affect performance.If omitted, tertiary analysis will be skipped. | | Int? | glnexus_mem_gb | Override GLnexus memory; optional | | | Int? | pbsv_call_mem_gb | Override PBSV call memory; optional | | | Boolean | gpu | Use GPU when possibleDefault: false | GPU support | | String | backend | Backend where the workflow will be executed["GCP", "Azure", "AWS-HealthOmics", "HPC"] | | | String? | zones | Zones where compute will take place; required if backend is set to 'AWS' or 'GCP'. | Determining available zones in GCP | | String? | cpuPlatform | Minimum CPU platform to use for tasks on GCP | Optional, only necessary in certain zones lacking n1 nodes. | | String? | gpuType | GPU type to use; required if gpu is set to true for cloud backends; must match backend | Available GPU types | | String? | container_registry | Container registry where workflow images are hosted.Default: "quay.io/pacbio" | If omitted, PacBio's public Quay.io registry will be used.Custom container_registry must be set if backend is set to 'AWS-HealthOmics'. | | Boolean | preemptible | Where possible, run tasks preemptibly[true, false]Default: true | If set to true, run tasks preemptibly where possible. If set to false, on-demand VMs will be used for every task. Ignored if backend is set to HPC. | Family Struct The Family struct contains the samples for the family. The struct has the following fields: | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | String | family_id | Unique identifier for the family | Alphanumeric characters, periods, dashes, and underscores are allowed. | | Array[Sample] | samples | Sample struct with sample specific data and metadata. | below | Sample Struct The Sample struct contains sample specific data and metadata. The struct has the following fields: | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | String | sample_id | Unique identifier for the sample | Alphanumeric characters, periods, dashes, and underscores are allowed. | | String? | sex | Sample sex["MALE", "FEMALE", null] | Used by HiFiCNV and TRGT for genotyping. Allosome karyotype will default to XX unless sex is specified as "MALE". Used for tertiary analysis X-linked inheritance filtering. | | Boolean | affected | Affected status | If set to true, sample is described as being affected by all HPO terms in phenotypes.If set to false, sample is described as not being affected by all HPO terms in phenotypes. | | Array[File] | hifi_reads | Array of paths to HiFi reads in unaligned BAM format. | | | Array[File]? | fail_reads | Array of paths to failed HiFi reads in unaligned BAM format (optional) | If provided, these reads will be aligned to the bait-captured regions. | | String? | father_id | sample_id of father (optional) | | | String? | mother_id | sample_id of mother (optional) | | Outputs Alignments, Coverage, and QC | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | String | workflow_name | Workflow name | | | String | workflow_version | Workflow version | | | Array[String] | msg | Messages from the workflow | | | File | msg_file | File containing messages from the workflow | | | Array[String] | sample_ids | Sample IDs | | | File | stats_file | Table of summary statistics | | | Array[File] | bam_stats | BAM stats | Per-read length and read-quality | | Array[File] | read_length_plot | Read length plot | | | Array[File?] | read_quality_plot | Read quality plot | | | Array[File] | merged_haplotagged_bam | Merged, haplotagged alignments | Includes unmapped reads | | Array[File] | merged_haplotagged_bam_index | | | | Array[File] | mosdepth_summary | Summary of aligned read depth. | | | Array[File] | mosdepth_region_bed | Median aligned read depth by 500bp windows. | | | Array[File] | mosdepth_region_bed_index | | | | Array[File] | mosdepth_depth_distribution_plot | | | | Array[File] | mapq_distribution_plot | Distribution of mapping quality per alignment | | | Array[File] | mg_distribution_plot | Distribution of gap-compressed identity score per alignment | | | Array[String] | stat_read_count | Number of reads | | | Array[String] | stat_read_length_mean | Mean read length | | | Array[String] | stat_read_length_median | Median read length | | | Array[String] | stat_read_length_n50 | Read length N50 | | | Array[String] | stat_read_quality_mean | Mean read quality | | | Array[String] | stat_read_quality_median | Median read quality | | | Array[String] | stat_mapped_read_count | Count of reads mapped to reference | | | Array[String] | stat_mapped_read_percent | Percent of reads mapped to reference | | | Array[String] | stat_gap_compressed_identity_mean | Mean gap-compressed identity | | | Array[String] | stat_gap_compressed_identity_median | Median gap-compressed identity | | | Array[String] | inferred_sex | Inferred sex | Sex is inferred based on relative depth of chrY alignments. | | Array[String] | stat_depth_mean | Mean depth | | Small Variants (<50 bp) | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File] | phased_small_variant_vcf | Phased small variant VCF | | | Array[File] | phased_small_variant_vcf_index | | | | Array[File] | small_variant_gvcf | Small variant GVCF | Can be used for joint-calling. | | Array[File] | small_variant_gvcf_index | | | | Array[File] | small_variant_stats | Small variant stats | Generated by bcftools stats. | | Array[String] | stat_small_variant_SNV_count | SNV count | (PASS variants) | | Array[String] | stat_small_variant_INDEL_count | INDEL count | (PASS variants) | | Array[String] | stat_small_variant_TSTV_ratio | Ts/Tv ratio | (PASS variants) | | Array[String] | stat_small_variant_HETHOM_ratio | Het/Hom ratio | (PASS variants) | | Array[File] | snv_distribution_plot | Distribution of SNVs by REF, ALT | | | Array[File] | indel_distribution_plot | Distribution of indels by size | | | File? | joint_small_variants_vcf | Joint-called small variant VCF | | | File? | joint_small_variants_vcf_index | | | Structural Variants (≥50 bp) | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File] | phased_sv_vcf | Phased structural variant VCF | | | Array[File] | phased_sv_vcf_index | Index for phased structural variant VCF | | | Array[String] | stat_sv_DUP_count | Structural variant DUP count | (PASS variants) | | Array[String] | stat_sv_DEL_count | Structural variant DEL count | (PASS variants) | | Array[String] | stat_sv_INS_count | Structural variant INS count | (PASS variants) | | Array[String] | stat_sv_INV_count | Structural variant INV count | (PASS variants) | | Array[String] | stat_sv_BND_count | Structural variant BND count | (PASS variants) | | Array[String] | stat_sv_SWAP_count | Structural variant sequence swap events | (PASS variants) | | File | sv_supporting_reads | Supporting reads for structural variants | | | Array[File] | sv_copynum_bedgraph | CNV copy number BEDGraph | | | Array[File] | sv_depth_bw | CNV depth BigWig | | | Array[File] | sv_gc_bias_corrected_depth_bw | CNV GC-bias corrected depth BigWig | | | Array[File] | sv_maf_bw | CNV MAF BigWig | | | Array[String] | sv_copynum_summary | CNV copy number summary JSON | | | Array[File] | bcftools_roh_out | ROH calling | bcftools roh | | Array[File] | bcftools_roh_bed | Generated from above, without filtering | | | File? | joint_sv_vcf | Joint-called structural variant VCF | | | File? | joint_sv_vcf_index | | | Mitochondrial variants and haplotypes | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File] | mitorsaw_vcf | Mitochondrial variant VCF | | | Array[File] | mitorsaw_vcf_index | Index for mitochondrial variant VCF | | | Array[File] | mitorsaw_hap_stats | Mitochondrial haplotype stats | | Tandem Repeat Genotyping | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File] | phased_trgt_vcf | Phased TRGT VCF | | | Array[File] | phased_trgt_vcf_index | | | | Array[File] | trgt_spanning_reads | TRGT spanning reads | | | Array[File] | trgt_spanning_reads_index | | | | Array[File] | trgt_coverage_dropouts | TRGT coverage dropouts | | | Array[String] | stat_trgt_genotyped_count | Count of genotyped sites | | | Array[String] | stat_trgt_uncalled_count | Count of ungenotyped sites | | | File? | joint_trgt_vcf | Joint-called TRGT VCF | | | File? | joint_trgt_vcf_index | | | Variant Phasing | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File] | phase_stats | Phasing stats | | | Array[File] | phase_blocks | Phase blocks | | | Array[File] | phase_haplotags | Per-read haplotag assignment | | | Array[String] | stat_phased_basepairs | Count of bp within phase blocks | | | Array[String] | stat_phase_block_ng50 | Phase block NG50 | | Variant Calling in Dark Regions | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File?] | paraphase_output_json | Paraphase output JSON | | | Array[File?] | paraphase_realigned_bam | Paraphase realigned BAM | | | Array[File?] | paraphase_realigned_bam_index | | | | Array[File?] | paraphase_vcfs | Paraphase VCFs | Compressed as .tar.gz | 5mCpG Methylation Calling | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File?] | cpg_hap1_bed | CpG hap1 BED | | | Array[File?] | cpg_hap1_bed_index | | | | Array[File?] | cpg_hap2_bed | CpG hap2 BED | | | Array[File?] | cpg_hap2_bed_index | | | | Array[File?] | cpg_combined_bed | CpG combined BED | | | Array[File?] | cpg_combined_bed_index | | | | Array[File?] | cpg_hap1_bw | CpG hap1 BigWig | | | Array[File?] | cpg_hap2_bw | CpG hap2 BigWig | | | Array[File?] | cpg_combined_bw | CpG combined BigWig | | | Array[String] | stat_cpg_hap1_count | Hap1 CpG count | | | Array[String] | stat_cpg_hap2_count | Hap2 CpG count | | | Array[String] | stat_cpg_combined_count | Combined CpG count | | | Array[File?] | methbat_profile | MethBat CpG profile | | | Array[String] | stat_methbat_methylated_count | Count of profiled regions labeled as methylated | | | Array[String] | stat_methbat_unmethylated_count | Count of profiled regions labeled as unmethylated | | | Array[String] | stat_methbat_asm_count | Count of profiled regions labeled as having allele specific methylation | | PGx Typing | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | Array[File] | pbstarphase_json | PBstarPhase JSON | Haplotype calls for PGx loci | | Array[File?] | pharmcat_match_json | PharmCAT match JSON | | | Array[File?] | pharmcat_phenotype_json | PharmCAT phenotype JSON | | | Array[File?] | pharmcat_report_html | PharmCAT report HTML | | | Array[File?] | pharmcat_report_json | PharmCAT report JSON | | Tertiary Analysis | Type | Name | Description | Notes | | ---- | ---- | ----------- | ----- | | File? | pedigree | Pedigree file in PLINK PED format | | | File? | tertiary_small_variant_filtered_vcf | Filtered, annotated small variant VCF | | | File? | tertiary_small_variant_filtered_vcf_index | | | | File? | tertiary_small_variant_filtered_tsv | Filtered, annotated small variant calls | | | File? | tertiary_small_variant_compound_het_vcf | Filtered, annotated compound heterozygous small variant VCF | | | File? | tertiary_small_variant_compound_het_vcf_index | | | | File? | tertiary_small_variant_compound_het_tsv | Filtered, annotated compound heterozygous small variant calls | | | File? | tertiary_sv_filtered_vcf | Filtered, annotated structural variant VCF | | | File? | tertiary_sv_filtered_vcf_index | | | | File? | tertiary_sv_filtered_tsv | Filtered, annotated structural variant TSV | |

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
0
Average
Average
Average