ZENODO
Dataset . 2025
Data sources: ZENODO

Twigstats scripts and example dataset

Authors: Speidel, Leo

Abstract

This repository provides all scripts needed to run Relate and Twigstats on imputed ancient genomes. We also provide a complete, self-contained example dataset; you should be able to use the exact same scripts on your own datasets as well.

Installation

  • Please install bcftools if you haven't already (https://samtools.github.io/bcftools/howtos/install.html). Make sure that BCFTOOLS_PLUGINS is set to the correct plugin path (see the bcftools link).
  • Please download Relate from https://myersgroup.github.io/relate/
  • Please also install the R package Twigstats from https://leospeidel.github.io/twigstats/
  • Optional: for plotting and downstream analyses, please install the R packages relater (from https://github.com/leospeidel/relater/), ggplot2, dplyr, tidyr, plyr, and umap.

Download the data

The tarball example_data_chr1.tgz stores files for chromosome 1 only, whereas example_data_wg.tgz stores files for the whole genome. Only one of the two needs to be downloaded to run this example. Please extract the tarball, e.g. using tar -xzvf example_data_chr1.tgz.

Running the scripts

In each directory, the script run.sh shows how to run everything in order. The individual scripts it calls are under scripts/. To run this on your own dataset, please download scripts.tgz. We provide ancestral genomes, genomic mask files, precomputed coalescence rates, and recombination rates for hg37 and hg38 at https://doi.org/10.5281/zenodo.15179497.

Input files

The directory example_data_chr1 stores files for chromosome 1 only, whereas example_data_wg stores files for the whole genome. Under example_data_wg/ and example_data_chr1/ you will find the following files:

  • GLIMPSE-imputed vcf, here named ancients_glimpse2_chr1.vcf.gz.
  • Modern vcf (e.g. 1000G), here named 1000GP_sub_chr1.vcf.gz.
  • A poplabels file listing population labels for each individual. Individuals have to appear in the same order as in the merged vcf file. The file should contain four columns: ID POP GROUP SEX. The second column is used for population assignment.
  • A second poplabels file used for the f4-ratio analysis. Again, the second column groups individuals into populations.
  • A third poplabels file used for the MDS analysis. The second column should now list the IDs of all individuals plotted in the MDS (i.e. it should be identical to the first column). The outgroup should be grouped together into one population.
  • A file containing sample ages in generations, two lines per sample (diploid). For example, for 3 samples of ages 0, 10, and 100 generations:

        0
        0
        10
        10
        100
        100

We provide all the other required Relate input files under Relate_input_files/. You can reuse these in your analysis.

In this example, we are using data from the 1000 Genomes Project dataset (Nature 2015). We additionally use low-coverage shotgun genomes from Anglo-Saxon contexts, the British Iron/Roman Age, the Irish Bronze Age, and the Scandinavian Early Iron Age (Cassidy et al, PNAS 2016; Martiniano et al, Nature Communications 2016; Anastasiadou et al, Communications Biology 2023; Schiffels et al, Nature Communications 2016; Gretzinger et al, Nature 2022; Rodriguez-Varela et al, Cell 2023). These were imputed using GLIMPSE (https://odelaneau.github.io/GLIMPSE).

Step by step guide

Please follow run.sh. This script will:

  1. Run scripts/1_prep_vcf.sh to filter the imputed genotypes.
  2. Run scripts/2_prep_Relate.sh to prepare the Relate input files.
  3. Run scripts/3_run_Relate.sh to estimate genealogies.

The resulting Relate files can be used for various analyses:

  • Run Twigstats and infer admixture proportions using Rscript scripts/4_run_Twigstats.R.
  • Estimate coalescence rates and population sizes using Rscript scripts/5_plot_popsize.R.
  • Run an MDS using Rscript scripts/6_plot_MDS.R.

To see the arguments required by each script, execute it without arguments, e.g. scripts/1_prep_vcf.sh or Rscript scripts/4_run_Twigstats.R. The expected output is shown in the attached pdf.
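
When adapting the scripts to your own dataset, the two per-sample files described under Input files are easy to get wrong. The following is a minimal Python sketch, not part of the provided scripts, of how one might write the sample ages file (two identical lines per diploid sample) and check that a poplabels file lists individuals in the same order as the merged vcf. The function names and the has_header option are hypothetical, and whether your poplabels file carries a header line is an assumption you should verify against the example data.

```python
from pathlib import Path


def write_sample_ages(ages_in_generations, path):
    """Write the sample ages file: one line per haplotype.

    Each diploid sample contributes two identical lines, so ages
    [0, 10, 100] become the six lines 0, 0, 10, 10, 100, 100.
    (Hypothetical helper; not part of the repository scripts.)
    """
    lines = []
    for age in ages_in_generations:
        lines.extend([str(age), str(age)])
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


def check_poplabels(poplabels_path, vcf_sample_ids, has_header=True):
    """Return True if poplabels IDs match the vcf sample order exactly.

    Assumes whitespace-separated columns ID POP GROUP SEX. The
    has_header flag is an assumption: set it to False if your file
    has no header line.
    """
    rows = [
        line.split()
        for line in Path(poplabels_path).read_text().splitlines()
        if line.strip()
    ]
    if has_header:
        rows = rows[1:]
    for row in rows:
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row}")
    return [row[0] for row in rows] == list(vcf_sample_ids)
```

For example, write_sample_ages([0, 10, 100], "sample_ages.txt") writes the six-line file from the example above. The vcf sample IDs passed to check_poplabels could be obtained with bcftools query -l on the merged vcf.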
