ZENODO
Dataset . 2021
License: CC 0
Data sources: ZENODO
DRYAD
Dataset . 2021
License: CC 0
Data sources: Datacite

A machine learning approach to integrating genetic and ecological data in tsetse flies (Glossina pallidipes) for spatially explicit vector control planning

Authors: Bishop, Anusha; Amatulli, Giuseppe; Hyseni, Chaz; Pless, Evlyn; Bateta, Rosemary; Okeyo, Winnie; Mireji, Paul; +5 Authors

Abstract

Introduction - Control of vector populations is an effective strategy for addressing vector-borne disease transmission. Effective vector control requires knowledge of habitat use and connectivity. Our goal was to improve this knowledge for the tsetse species Glossina pallidipes, a vector of animal African trypanosomiasis, a wasting disease of livestock that represents a serious socioeconomic burden across sub-Saharan Africa.

Methods and Results - We used random forest regression to (i) build and integrate models of G. pallidipes habitat suitability and genetic connectivity across Kenya and northern Tanzania, and (ii) provide novel vector control recommendations. Inputs for the models included field-survey records from 349 trap locations, genetic data from 11 microsatellite loci genotyped in 659 flies from 29 sampling sites, and remotely sensed environmental data. The suitability and connectivity models explained approximately 80% and 67% of the variance in the occurrence and genetic data, respectively, and exhibited high accuracy based on cross-validation. The bivariate map of the two model outputs showed that suitability and connectivity vary independently across the landscape, and it informed our vector control recommendations. Post-hoc analyses show spatial variation in the correlations between the most important environmental predictors from our models and each response variable (e.g. suitability and connectivity), as well as heterogeneity in the expected future climatic change of these predictors.

Discussion - The bivariate map suggests vector control is most likely to be successful in the Lake Victoria basin, and supports the previous recommendation that most of eastern Kenya should be managed as a single unit. We further recommend that future monitoring efforts focus on tracking potential changes in vector presence and dispersal around the Serengeti and the Lake Victoria basin, based on projected local climatic shifts. The strong performance of the spatial models suggests potential for our integrative methodology to be used to understand future impacts of climate change in this and other vector systems.
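
The bivariate map mentioned in the abstract combines two per-location quantities, habitat suitability and genetic connectivity. As a rough illustration of the idea only, the sketch below bins each variable into terciles and pairs the two bin indices into one of nine classes. The function names, the tercile scheme, and the example values are assumptions made for illustration; they are not the procedure or data of Bishop et al. (2021).

```python
from bisect import bisect_right


def tercile_breaks(values):
    """Return two cut points that split the values into three roughly equal groups."""
    if not values:
        raise ValueError("at least one value is required")
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def bin_index(value, breaks):
    """Map a value to 0 (low), 1 (medium), or 2 (high) given two cut points."""
    return bisect_right(breaks, value)


def bivariate_classes(suitability, connectivity):
    """Pair the suitability bin with the connectivity bin for each location."""
    s_breaks = tercile_breaks(suitability)
    c_breaks = tercile_breaks(connectivity)
    return [
        (bin_index(s, s_breaks), bin_index(c, c_breaks))
        for s, c in zip(suitability, connectivity)
    ]


# Hypothetical values for six locations (not taken from the dataset).
suitability = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
connectivity = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]
classes = bivariate_classes(suitability, connectivity)
```

A class such as (2, 0) would mark a location with high suitability but low connectivity. Because the two indices are assigned independently, the scheme can represent the kind of independent variation between the two variables that the abstract reports.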

The Bishop2021_HabitatSuitability_Data.csv file contains the data used in the habitat suitability model (i.e. information about the trap locations). Abbreviations: TrapNo (trap number), Lat (latitude), Long (longitude), StartDate (date traps were set out), EndDate (date flies were collected from traps), NumberDays (number of days between StartDate and EndDate).

The Bishop2021_GenConModel_AllData.csv file contains the data used in the genetic connectivity model. All columns starting with "BIO" contain the median value of the corresponding bioclimatic variable along straight paths between sites. The "kernel" column contains the median value of the kernel density layer along the same straight paths. The "pixvals" column contains the geographic distance between sites in pixels (1 km resolution). The "Distance" column contains the Cavalli-Sforza and Edwards’ chord (CSE) genetic distance between sites. See the methods of the paper (Bishop et al., 2021) for more detail.

The Gpd_KenTza_11loci_659indv_genepop.txt file contains the microsatellite genotypes for the 659 individuals used in this study in GenePop format (https://genepop.curtin.edu.au/), and the Gpd_KenTza_11loci_659indv_sample_info.csv file provides information about these individuals.
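
Since NumberDays is described as the number of days between StartDate and EndDate, a quick consistency check on the habitat suitability file might look like the sketch below. It assumes that StartDate and EndDate are actual column headers and that dates are stored in ISO format (YYYY-MM-DD); neither is confirmed by the description above, so the parsing line may need adjusting. The two sample rows are invented stand-ins, not records from the dataset.

```python
import csv
import io
from datetime import date


def check_number_days(rows):
    """Return (TrapNo, recorded, computed) for rows whose NumberDays disagrees with the dates."""
    mismatches = []
    for row in rows:
        # Assumption: ISO-formatted dates; change the parsing if the file differs.
        start = date.fromisoformat(row["StartDate"])
        end = date.fromisoformat(row["EndDate"])
        computed = (end - start).days
        recorded = int(row["NumberDays"])
        if computed != recorded:
            mismatches.append((row["TrapNo"], recorded, computed))
    return mismatches


# Hypothetical sample standing in for Bishop2021_HabitatSuitability_Data.csv.
sample = io.StringIO(
    "TrapNo,Lat,Long,StartDate,EndDate,NumberDays\n"
    "T1,-1.25,34.75,2015-03-01,2015-03-04,3\n"
    "T2,-2.50,35.10,2015-03-02,2015-03-05,2\n"
)
mismatches = check_number_days(csv.DictReader(sample))
```

To run the check on the real file, the `io.StringIO` sample can be replaced with `open("Bishop2021_HabitatSuitability_Data.csv", newline="")`, provided the column headers match the abbreviations listed above.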

A description of the methods used to collect and process this dataset is available in the corresponding paper (Bishop et al., 2021).

Keywords

habitat suitability, spatial modeling, FOS: Biological sciences, Landscape genetics, random forest
