Powered by OpenAIRE graph
Software
Data sources: ZENODO

Data from: On the utility of deep learning for model classification and parameter estimation on complex diversification scenarios

Authors: Gutiérrez de la Peña, Pablo; Iglesias, Guillermo; Talavera, Edgar; Meseguer, Andrea; Sanmartín, Isabel

Abstract

Birth-Death models applied to dated phylogenies are a useful tool for studying past diversification dynamics. Parameters in these stochastic models are typically inferred using likelihood-based methods such as Maximum Likelihood Estimation (MLE) or Bayesian Inference, though some of the most complex models are computationally difficult to fit. Recent years have seen the development of Deep Learning (DL) methods applied to evolutionary biology and phylogenetic inference. Here, we explore the power of Convolutional Neural Networks (CNNs), a type of DL method, to solve classification and regression (parameter estimation) tasks under six rate-constant and rate-variable diversification scenarios: Constant Birth-Death, High-Extinction, Mass-Extinction, Diversity-Dependent, Stasis-and-Radiate, and Waxing-and-Waning. We simulated 10,000 phylogenetic trees under each diversification scenario and encoded them using a vectorization procedure that captures topology and branch length information. The encoded trees were used to train and test a set of CNN models tailored to three empirical case studies that differ in the number of tips. We compared the CNNs' performance with MLE inference. Our results show that CNNs reached classification accuracies of 80-90%, whereas MLE, using AIC as the model selection criterion, reached 60-69%. The most difficult scenarios for the CNNs to predict were High-Extinction and Mass-Extinction, which were often misidentified as one another. For the regression tasks, CNN models generally obtained lower mean average errors than MLE inference, irrespective of the number of tips in the simulated phylogenies, though differences were small. The only exception was the discrete time event parameter in the episodic diversification scenarios (Mass-Extinction, Stasis-and-Radiate, and Waxing-and-Waning), for which MLE inference showed a lower error than the CNNs.
Finally, we illustrate and discuss the application of our CNNs to real-world phylogenies, using three classic empirical case studies: eucalypts, conifers, and cetaceans.
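
The abstract names AIC as the model selection criterion for the MLE baseline. As a minimal sketch of how such a selection step could work (not the authors' pipeline), the Python snippet below computes AIC = 2k - 2 ln L for each candidate model and picks the one with the lowest score. The function names, log-likelihood values, and parameter counts are hypothetical and chosen only for illustration; they do not come from the dataset.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(fits: dict) -> tuple:
    """Pick the model with the lowest AIC.

    `fits` maps a model name to (maximized log-likelihood, number of free
    parameters). Returns (best model name, dict of AIC scores).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits for a single tree: the log-likelihoods and parameter
# counts below are made-up illustration values, not results from the study.
example_fits = {
    "constant_birth_death": (-120.4, 2),
    "diversity_dependent": (-119.8, 3),
    "mass_extinction": (-118.9, 4),
}

best_model, aic_scores = select_by_aic(example_fits)
print(best_model, aic_scores)
```

In this made-up example the richer models fit slightly better, but the penalty of 2 per extra parameter outweighs the gain in log-likelihood, so the simplest model is selected.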
