ZENODO
Dataset . 2026
License: CC BY
Data sources: Datacite

Pv4: An open dataset of Plasmodium vivax genome variation in 1,895 worldwide samples

Authors: MalariaGEN Parasite

Abstract

This dataset contains files relating to the Pv4 dataset, which contains genome variation data on 1,895 worldwide samples of Plasmodium vivax. The key publication is MalariaGEN et al, Wellcome Open Research 2022, 7:136 (https://doi.org/10.12688/wellcomeopenres.17795.1). This record contains details on contributing partner studies, sample metadata and key sample attributes inferred from genomic data. Further details and analytical results can be found in the accompanying data release paper.

These data are available open access. Publications using these data should acknowledge and cite the source of the data using the following format: "This publication uses data from the MalariaGEN Plasmodium vivax Genome Variation Project as described in ‘An open dataset of Plasmodium vivax genome variation in 1,895 worldwide samples’. MalariaGEN et al. Wellcome Open Research 2022, 7:136 https://doi.org/10.12688/wellcomeopenres.17795.1"

This record includes:

  • Study information: details of the 11 contributing partner studies and 3 external studies, including description, contact information and key people.
  • Sample provenance and sequencing metadata: sample information including partner study information, location and year of collection, ENA accession numbers, and QC information for 1,895 samples from 27 countries.
  • Measure of complexity of infections: characterisation of within-host diversity (FWS) for 1,072 QC pass samples.
  • Drug resistance marker genotypes: genotypes at known markers of drug resistance for 1,895 samples, containing amino acid and copy number genotypes at 3 loci: dhfr, dhps, mdr1.
  • Inferred resistance status classification: classification of 1,072 QC pass samples into different types of resistance to 4 drugs or combinations of drugs: pyrimethamine, sulfadoxine, mefloquine, and sulfadoxine-pyrimethamine combination.
  • Drug resistance markers to inferred resistance status: details of the heuristics utilised to map genetic markers to resistance status classification.
  • Tandem duplication genotypes: genotypes for tandem duplications discovered in four regions of the genome.
  • Genome regions and Genome regions index: a BED file classifying genomic regions as core genome or different classes of non-core genome, together with a tabix index file for the genome regions file.
  • Reference genome and genome annotations file: used in variant calling pipelines.

A README file describes in detail all the files included in the release, the format and interpretation of each column, and contains some tips and tricks for accessing the genotype data in VCF and zarr files.
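
As an illustration of how a genome regions file of this kind might be used, the following minimal sketch sums the number of bases assigned to each region class in BED-formatted text. It assumes the standard four-column BED layout (contig, 0-based start, exclusive end, label). The contig names, coordinates and class labels in the sample are hypothetical and are not taken from the release, so the README should be consulted for the actual file name and columns.

```python
import io
from collections import defaultdict

# Hypothetical excerpt in BED layout: contig, 0-based start, exclusive end, region class.
# Contig names, coordinates and class labels are invented for illustration only.
SAMPLE_BED = (
    "chr_example_01\t0\t1000\tSubtelomere\n"
    "chr_example_01\t1000\t9000\tCore\n"
    "chr_example_01\t9000\t9500\tCentromere\n"
    "chr_example_02\t0\t4000\tCore\n"
)


def region_lengths(handle):
    """Return the total number of bases per region class in a BED-like stream."""
    totals = defaultdict(int)
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and the optional BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        _contig, start, end, label = line.split("\t")[:4]
        # BED intervals are half-open, so the length is simply end - start.
        totals[label] += int(end) - int(start)
    return dict(totals)


totals = region_lengths(io.StringIO(SAMPLE_BED))
for label, length in sorted(totals.items()):
    print(f"{label}\t{length}")
```

To run this against a real file, pass an open file handle in place of the `io.StringIO` object. If the file is bgzip-compressed, as tabix-indexed files usually are, `gzip.open(path, "rt")` can read it.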
