ZENODO
Dataset . 2025
License: CC BY NC ND
Data sources: ZENODO
ZENODO
Dataset . 2025
License: CC BY NC ND
Data sources: Datacite

GEMS-GER: A Machine Learning Benchmark Dataset of Long-Term Groundwater Levels in Germany with Meteorological Forcings and Site-Specific Environmental Features

Groundwater Levels, Environment, Meteorology, Site Properties
Authors: Ohmer, Marc;

Abstract

This repository provides the dataset accompanying the GEMS-GER (Groundwater Levels, Environment, Meteorology, Site Properties – Germany) benchmark for machine learning-based groundwater modeling. The dataset includes long-term groundwater level time series, meteorological and hydrological forcing data, site-specific environmental properties, and benchmark model evaluation results. All data originate from official public sources and have been harmonized across the 16 German federal states (Bundesländer).

Contents of this repository:

  • Groundwater level time series (GEMS-GER_data/dynamic/*.csv): weekly aggregated groundwater levels (GWL) from 3,207 monitoring wells (1991–2022), including:
      • Daily temperature (mean, min, max)
      • Precipitation and humidity (HYRAS/DWD)
      • Real, potential, and reference evapotranspiration
      • Soil moisture and soil temperature (5 m)
      • Snow water equivalent, snowmelt, and runoff (ERA5-Land)
      • GWL_flag indicating observed vs. imputed values
  • Site-specific static descriptors (GEMS-GER_data/static/static_features.csv):
      • Hydrogeology and soil type
      • Land use and climate classification
      • Elevation and derived topographic parameters (e.g. slope, TWI)
  • Benchmark model evaluation results (GEMS-GER_data/static/model_performance.csv):
      • NSE, RMSE, R², and Bias scores for ML models applied to each well
  • Pre-generated time series plots (GEMS-GER_data_figures/*.pdf):
      • Visualizations of groundwater levels and selected forcing variables for all wells
      • Provided separately to reduce the size of the main dataset download

Directory structure:

GEMS-GER_data/
├── dynamic/                    # 3,207 individual CSV files, one per well
│   ├── MW_1.csv
│   ├── MW_2.csv
│   └── ...
├── static/
│   ├── static_features.csv     # Site-specific static descriptors (e.g. geology, land use, climate)
│   └── model_performance.csv   # ML model evaluation metrics (NSE, RMSE, R², Bias)
├── license_information.txt     # Licensing details for groundwater level data from federal state sources
└── README.md                   # Dataset description and usage notes

GEMS-GER_data_figures/
├── DYN_Feat_MW_1.pdf
├── DYN_Feat_MW_2.pdf
└── ...

The dataset is intended for research and benchmarking in hydrogeology, data-driven groundwater modeling, and environmental machine learning. It forms the basis of the GEMS-GER benchmark, as described in the associated preprint. Harmonizing the data across administrative and institutional boundaries enables consistent large-scale analysis.
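The layout described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not the authors' tooling: the column name `GWL`, the flag value `observed`, and the helper names `well_files`, `load_observed`, and `scores` are assumptions made for illustration, so check the actual CSV header and README.md before relying on them. The metric formulas are the common textbook definitions; the sign convention for Bias and the use of the squared Pearson correlation for R² are likewise assumptions and may differ from what model_performance.csv reports.

```python
import csv
import math
from pathlib import Path


def well_files(root):
    """Return the per-well CSV paths under <root>/dynamic, ordered by well number."""
    files = Path(root, "dynamic").glob("MW_*.csv")
    # Sort numerically so that MW_10.csv comes after MW_2.csv.
    return sorted(files, key=lambda p: int(p.stem.split("_")[1]))


def load_observed(path, value_col="GWL", flag_col="GWL_flag", observed="observed"):
    """Read one well file and keep only groundwater levels flagged as observed.

    value_col and the `observed` marker are hypothetical defaults; only the
    existence of a GWL_flag column is stated in the dataset description.
    """
    values = []
    with open(path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            # Skip imputed rows and rows with an empty level.
            if row[flag_col] == observed and row[value_col] != "":
                values.append(float(row[value_col]))
    return values


def scores(obs, sim):
    """Compute NSE, RMSE, R² and Bias for two equally long sequences."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    sq_err = sum((s - o) ** 2 for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    return {
        "NSE": 1.0 - sq_err / var_obs,           # Nash-Sutcliffe efficiency
        "RMSE": math.sqrt(sq_err / n),           # root mean squared error
        "R2": cov ** 2 / (var_obs * var_sim),    # squared Pearson correlation (assumed)
        "Bias": sum(s - o for o, s in zip(obs, sim)) / n,  # simulated minus observed (assumed sign)
    }
```

With the archive unpacked, `well_files("GEMS-GER_data")` would list the well files in numeric order, and `load_observed` can be applied to any of them. `scores` is independent of the file reading and expects the observed and simulated series to be aligned already; it does not guard against constant series, which would cause a division by zero.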

Keywords

groundwater levels, hydrogeology, environmental data, Germany, machine learning benchmark, meteorological forcing, groundwater monitoring, geospatial dataset, hydrology, time series
