ZENODO
Dataset . 2019
License: CC BY
Data sources: Datacite, ZENODO

Simulated dataset (Almeida et al., 2018 - GigaScience) treated with various clustering programs to evaluate ReClustOR efficiency and consistency

Authors: TERRAT, Sébastien; DJEMIEL, Christophe; JOURNAY, Corentin; DEQUIEDT, Samuel; KARIMI, Battle; HORRIGUE, Walid; MARON, Pierre-Alain; +2 Authors

Abstract

ReClustOR is a novel clustering method that overcomes some of the problems associated with classical ‘heuristic’ clustering methods and consequently increases the stability and quality of the reconstructed OTUs. Moreover, the OTU database defined with ReClustOR can be used as a reference and gradually enriched with new studies and samples. In this way, huge datasets like the Earth Microbiome Project can be easily used as references for smaller projects, thereby increasing the quality of comparisons between studies and datasets.

Here, we propose a new approach called ReClustOR (for RE-CLUSTering method using an Open-Reference approach) to improve OTU consistency (see https://doi.org/10.5281/zenodo.2597402). This new strategy combines two previously described clustering methods. Firstly, a classical clustering method (e.g. SWARM or VSEARCH) is used to define OTU centroids and create a reference database. Secondly, a closed- or open-reference method (depending on the user’s choice) is applied to all reads that are not OTU centroids. Contrary to the classical clustering methods, each read is compared to all centroids using a distance-based greedy clustering technique (Edgar, 2010; He et al., 2015), and then assigned to the nearest one, thereby fixing the erroneous assignments of reads to OTUs.

To highlight the improvements provided by ReClustOR in describing microbial diversity in terms of ecological diversity metrics (e.g. richness, OTU composition, Shannon, 1/Simpson) and taxonomic composition, a simulated dataset was subjected to: (i) ESV definition, (ii) multiple conventional de novo methods (i.e. a homemade de novo clustering close to CRUNCHCLUST, VSEARCH and SWARM), and (iii) ReClustOR computation. This simulated dataset (Almeida et al., 2018) contains a diverse set of genera commonly found in three different ecosystems: human gut, ocean and soil.
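The second step described above (comparing each non-centroid read to every centroid and assigning it to the nearest one) can be illustrated with a minimal Python sketch. This is not the ReClustOR implementation: the function names, the `threshold` parameter, and the use of a normalised Levenshtein distance in place of an alignment-based distance are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (an assumed stand-in for a real alignment distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def dissimilarity(a: str, b: str) -> float:
    """Edit distance normalised by the longer sequence length."""
    return edit_distance(a, b) / max(len(a), len(b), 1)


def reassign_reads(
    reads: List[str],
    centroids: List[str],
    threshold: float = 0.05,      # hypothetical dissimilarity cut-off
    open_reference: bool = True,  # False mimics a closed-reference run
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign every read to its nearest centroid, scanning all centroids."""
    centroids = list(centroids)
    otus: Dict[str, List[str]] = {c: [c] for c in centroids}
    unassigned: List[str] = []
    for read in reads:
        # Compare against ALL centroids, not just the first acceptable hit.
        best = min(centroids, key=lambda c: dissimilarity(read, c), default=None)
        if best is not None and dissimilarity(read, best) <= threshold:
            otus[best].append(read)
        elif open_reference:
            # Open-reference: an unmatched read seeds a new OTU.
            centroids.append(read)
            otus[read] = [read]
        else:
            # Closed-reference: an unmatched read is left out.
            unassigned.append(read)
    return otus, unassigned


if __name__ == "__main__":
    centroids = ["ACGTACGTAC", "TTTTGGGGCC"]
    reads = ["ACGTACGTAA", "TTTTGGGGCA", "GGGGGGGGGG"]
    otus, unassigned = reassign_reads(reads, centroids, threshold=0.1)
    for centroid, members in otus.items():
        print(centroid, members)
```

Scanning every centroid before assigning is what distinguishes this from a first-hit heuristic, where a read joins the first centroid that passes the threshold even if a closer one exists.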
The clustering methods were compared for: (i) their ability to describe microbial richness, (ii) the congruence between OTU assignments and sequence taxonomy, (iii) the robustness of each defined OTU, and (iv) their ability to efficiently describe the microbial community based on OTU composition. The simulated dataset (00_Raw_data) and all analysis steps are provided here so that they can be reused to test ReClustOR and to better understand the files and data produced by this program. More details are available in the Tree_of_data.tree file.
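For readers who want to recompute the diversity metrics named above (richness, Shannon, 1/Simpson) from an OTU count table, the following Python sketch shows one common formulation. It is an illustration only: the function name and input layout are hypothetical, the natural logarithm is assumed for Shannon, and Simpson's index is taken as the uncorrected sum of squared proportions, which may differ from the exact estimators used to produce the files in this dataset.

```python
import math
from typing import Dict


def diversity_metrics(otu_counts: Dict[str, int]) -> Dict[str, float]:
    """Richness, Shannon index and inverse Simpson index for one sample."""
    counts = [n for n in otu_counts.values() if n > 0]
    total = sum(counts)
    if total == 0:
        return {"richness": 0, "shannon": 0.0, "inverse_simpson": 0.0}
    proportions = [n / total for n in counts]
    # Shannon: H = -sum(p_i * ln p_i), natural log assumed.
    shannon = -sum(p * math.log(p) for p in proportions)
    # Simpson: D = sum(p_i^2); the inverse form 1/D is reported.
    simpson = sum(p * p for p in proportions)
    return {
        "richness": len(counts),
        "shannon": shannon,
        "inverse_simpson": 1.0 / simpson,
    }


if __name__ == "__main__":
    sample = {"OTU_1": 5, "OTU_2": 5}  # made-up counts for illustration
    print(diversity_metrics(sample))
```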

References

Alexandre Almeida, Alex L Mitchell, Aleksandra Tarkowska, Robert D Finn. Benchmarking taxonomic assignments based on 16S rRNA gene profiling of the microbiota from commonly sampled environments. GigaScience, Volume 7, Issue 5, May 2018, giy054. https://doi.org/10.1093/gigascience/giy054

Keywords

Simulated dataset, ReClustOR, VSEARCH, SWARM, Clustering, OTU
