
This record contains files relating to Pv4, a dataset of genome variation data on 1,895 worldwide samples of Plasmodium vivax. The key publication is MalariaGEN et al., Wellcome Open Research 2022, 7:136 https://doi.org/10.12688/wellcomeopenres.17795.1

The record contains details on contributing partner studies, sample metadata and key sample attributes inferred from genomic data. Further details and analytical results can be found in the accompanying data release paper.

These data are available open access. Publications using these data should acknowledge and cite the source of the data using the following format: "This publication uses data from the MalariaGEN Plasmodium vivax Genome Variation Project as described in ‘An open dataset of Plasmodium vivax genome variation in 1,895 worldwide samples’. MalariaGEN et al. Wellcome Open Research 2022, 7:136 https://doi.org/10.12688/wellcomeopenres.17795.1"

Files included in this record:

Study information: details of the 11 contributing partner studies and 3 external studies, including description, contact information and key people.
Sample provenance and sequencing metadata: sample information including partner study information, location and year of collection, ENA accession numbers, and QC information for 1,895 samples from 27 countries.
Measure of complexity of infections: characterisation of within-host diversity (FWS) for 1,072 QC pass samples.
Drug resistance marker genotypes: genotypes at known markers of drug resistance for 1,895 samples, containing amino acid and copy number genotypes at 3 loci: dhfr, dhps, mdr1.
Inferred resistance status classification: classification of 1,072 QC pass samples into different types of resistance to 4 drugs or combinations of drugs: pyrimethamine, sulfadoxine, mefloquine, and sulfadoxine-pyrimethamine combination.
Drug resistance markers to inferred resistance status: details of the heuristics used to map genetic markers to resistance status classification.
Tandem duplication genotypes: genotypes for tandem duplications discovered in four regions of the genome.
Genome regions and Genome regions index: a BED file classifying genomic regions as core genome or different classes of non-core genome, plus a tabix index file for the genome regions file.
Reference genome and genome annotations file: used in variant calling pipelines.

A README file describes in detail all the files included in the release, the format and interpretation of each column, and contains some tips and tricks for accessing the genotype data in VCF and zarr files.
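As a rough illustration of how a genome regions file of this kind might be summarised, the sketch below totals the number of bases in each region class from BED-formatted text. It assumes a tab-separated layout with the region class in the fourth column; the contig names, coordinates and class labels in the example are invented for illustration and are not taken from the release, so the README should be consulted for the actual column layout.

```python
from collections import defaultdict
import io

# Hypothetical excerpt: contig names, coordinates and labels are invented.
EXAMPLE_BED = (
    "chr_example_01\t0\t1000\tSubtelomere\n"
    "chr_example_01\t1000\t9000\tCore\n"
    "chr_example_01\t9000\t9500\tSubtelomere\n"
    "chr_example_02\t0\t4000\tCore\n"
)


def summarise_regions(handle):
    """Return total base pairs per region class from a 4-column BED stream."""
    totals = defaultdict(int)
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, label = line.split("\t")[:4]
        # BED intervals are 0-based and half-open, so length is end - start.
        totals[label] += int(end) - int(start)
    return dict(totals)


totals = summarise_regions(io.StringIO(EXAMPLE_BED))
for label, n_bases in sorted(totals.items()):
    print(f"{label}\t{n_bases}")
```

To apply the same function to a real file, an open file handle can be passed in place of the in-memory example, provided the file follows the assumed four-column layout.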
