
Global Standardised Soil Eukaryome Dataset (GloSED)

Dataset description

The GloSED dataset is a metabarcoding-based dataset encompassing the entire spectrum of soil eukaryotes collected and analysed using standardized protocols.

Key characteristics

- Sampling sites: 4,147 globally distributed locations across 121 countries
- Taxonomic scope: Complete soil eukaryome including fungi, protists, animals, and plants
- Operational taxonomic units: 988,824 curated OTUs
- Sequencing technology: PacBio long-read sequencing of full-length ITS

Data collection and processing

- Standardized sampling design: 50x50 m plots
- Soil cores: 40 cores per plot (5 cm diameter x 5 cm depth), pooled by volume
- DNA extraction: PowerMax Soil DNA Isolation kit (Qiagen) with FavorPrep cleanup
- Primers: universal eukaryotic primers ITS9mun/ITS4ngsUni
- Processing: NextITS v.1.0.0 workflow (DOI: 10.5281/zenodo.15074882)
- Taxonomic annotation: EUKARYOME v.1.9.4 database (DOI: 10.1093/database/baae043)

Data files and formats

Core data

- `GloSED__OTU_sequences.fasta.gz`: Quality-filtered representative sequences for all OTUs (FASTA format)
- `GloSED__OTU_table.tsv.zip`: Sample-by-OTU abundance matrix (TSV format)
- `GloSED__Taxonomy.tsv.zip`: Complete taxonomic annotations with UNITE-based species hypotheses (TSV format)
- `GloSED__OTU_table.parquet`: Columnar format of abundance data for efficient querying (Parquet format)
- `GloSED__Taxonomy.parquet`: Columnar format of taxonomic data (Parquet format)
- `GloSED__phyloseq.RData`: phyloseq object for R-based analyses
- `GloSED__BIOM.biom`: BIOM v.2.1 format compatible with QIIME2

Metadata files

- `GloSED__Sample_metadata.xlsx`: Sample metadata
- `DRI.json`: Data Reuse Information tag with ORCID identifiers
- `DRI.csv`: Tabular format mapping accession IDs
- `Contributors.xlsx`: List of contributors

Data reuse information

This dataset includes Data Reuse Information (DRI) tags to support equitable data sharing (Hug et al., 2025).
The DRI identifies creators who prefer to be contacted before reuse:

DRI: `{0000-0002-1635-1249, 0000-0003-2786-2690}`

Please contact these individuals prior to reuse of the data.

Related resources

- Raw sequence data: European Nucleotide Archive (ENA) project: PRJEB103811
- Sample accessions: ERS27941879 - ERS27946063
- Sequence accessions: ERR15957609 - ERR15964175
- Bioinformatics pipeline: NextITS
- Reference databases: EUKARYOME, UNITE
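
The DRI tag quoted in the data reuse section is a brace-delimited list of ORCID iDs. The following is a minimal Python sketch of how such a tag could be parsed and sanity-checked before contacting the creators. It works only on the inline tag string shown above; the layout of `DRI.json` and `DRI.csv` is not described here, so the sketch does not attempt to read them. The function names `parse_dri` and `orcid_check_digit` are illustrative, not part of any GloSED tooling. The checksum is the ISO 7064 MOD 11-2 scheme that ORCID uses for the final character of an iD.

```python
import re

# An ORCID iD is four hyphen-separated groups of four characters;
# the last character is a check digit (0-9 or X).
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_digit(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def parse_dri(tag: str) -> list[str]:
    """Extract ORCID iDs from a DRI tag string (hypothetical helper).

    Raises ValueError if any iD fails its checksum, which usually
    indicates a copy-and-paste or transcription error.
    """
    orcids = ORCID_PATTERN.findall(tag)
    for orcid in orcids:
        digits = orcid.replace("-", "")
        if orcid_check_digit(digits[:15]) != digits[15]:
            raise ValueError(f"invalid ORCID checksum: {orcid}")
    return orcids


# The tag string as given in this description.
dri_tag = "{0000-0002-1635-1249, 0000-0003-2786-2690}"
for orcid in parse_dri(dri_tag):
    print(f"https://orcid.org/{orcid}")
```

Each printed URL is the public ORCID record page for one of the creators to contact.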
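
The ENA sample and sequence accessions above are given as ranges. As a rough sketch, the ranges could be expanded into individual accession IDs for batch retrieval as shown below. This assumes that each range is inclusive and contiguous and that every accession inside it belongs to project PRJEB103811, which this description does not confirm; the accession mapping in `DRI.csv` or the ENA project listing would be the authoritative source. The helper names are hypothetical.

```python
import re
from typing import Iterator

# Accessions here consist of an uppercase prefix followed by digits.
_ACCESSION = re.compile(r"([A-Z]+)(\d+)")


def split_accession(accession: str) -> tuple[str, int, int]:
    """Split an accession into (prefix, number, digit width)."""
    match = _ACCESSION.fullmatch(accession)
    if match is None:
        raise ValueError(f"unrecognised accession: {accession}")
    prefix, digits = match.groups()
    return prefix, int(digits), len(digits)


def expand_accessions(first: str, last: str) -> Iterator[str]:
    """Yield every accession from first to last, inclusive.

    ASSUMPTION: the range is contiguous; gaps would have to be
    filtered out against the actual project listing.
    """
    prefix, start, width = split_accession(first)
    last_prefix, stop, _ = split_accession(last)
    if prefix != last_prefix or stop < start:
        raise ValueError(f"not a valid range: {first} - {last}")
    for number in range(start, stop + 1):
        yield f"{prefix}{number:0{width}d}"


samples = list(expand_accessions("ERS27941879", "ERS27946063"))
runs = list(expand_accessions("ERR15957609", "ERR15964175"))
print(len(samples), samples[0], samples[-1])
print(len(runs), runs[0], runs[-1])
```

The number of accessions in a range need not equal the number of sampling sites, so the counts printed here should be read as upper bounds on what the ranges cover rather than as dataset statistics.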
eukaryotes, metabarcoding, soil, biodiversity
