
Repository containing data and machine learning workflows used to identify environmental drivers controlling fluorescent dissolved organic matter (fDOM) dynamics in freshwater systems: https://github.com/danielmerbet/driver_attribution_fdom/

The repository accompanies the preprint (the paper has been accepted; an updated link will be added soon):

A machine learning approach to driver attribution of dissolved organic matter dynamics in two contrasting freshwater systems
https://doi.org/10.5194/egusphere-2025-4049

Overview

Dissolved organic matter (DOM) dynamics are influenced by multiple environmental drivers, including hydrology, meteorology, and seasonal cycles.

This repository provides:

- Data used in the study
- Machine learning workflows
- Feature importance analysis
- SHAP interpretation of models
- Scripts to reproduce all results

The workflow combines multiple machine learning algorithms:

- Random Forest
- XGBoost
- LightGBM
- CatBoost
- Kernel methods
- k-nearest neighbors

These models are used to identify the most influential drivers controlling fDOM variability across two contrasting freshwater systems.
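Because each algorithm reports feature importance on its own scale, comparing drivers across models usually requires a common scale first. The following is a minimal sketch of one way to do that, assuming each model's importances are normalized to sum to 1 and then averaged. It is not necessarily the procedure used in 4_extract_importance.R. The function names, the feature names (`discharge`, `air_temp`, `julian_day`), and all numbers are hypothetical and are not results from the study.

```python
from statistics import mean


def normalize(importances):
    """Scale one model's raw importances so they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return {name: value / total for name, value in importances.items()}


def aggregate_importance(per_model):
    """Average normalized importances across models and rank the features."""
    normalized = [normalize(imp) for imp in per_model.values()]
    features = sorted(set().union(*(n.keys() for n in normalized)))
    # A feature missing from a model counts as zero importance for that model.
    averaged = {f: mean(n.get(f, 0.0) for n in normalized) for f in features}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Illustrative numbers only (hypothetical, not taken from the study).
example = {
    "random_forest": {"discharge": 40.0, "air_temp": 35.0, "julian_day": 25.0},
    "xgboost": {"discharge": 0.5, "air_temp": 0.3, "julian_day": 0.2},
}
ranking = aggregate_importance(example)

for feature, score in ranking:
    print(f"{feature}: {score:.3f}")
```

Averaging normalized scores weights every model equally. Averaging ranks instead would be a reasonable alternative when one model produces very skewed importances.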
Study sites

Two study sites were analyzed:

- Lough Feeagh (Ireland): humic oligotrophic lake with a peatland-dominated catchment and temperate oceanic climate
- Sau Reservoir (Spain): eutrophic reservoir with a human-influenced catchment and Mediterranean climate

Repository structure

    driver_attribution_fdom/
    │
    ├── README.md
    │
    ├── 1_hyperparameter_tuning.R
    │     Hyperparameter optimization for all ML models
    │
    ├── 2_MLrun_most-influential-features.R
    │     ML simulations using selected drivers
    │
    ├── 3_MLrun_reanalysis-julianday.R
    │     ML simulations using reanalysis meteorology + seasonal predictors
    │
    ├── 4_extract_importance.R
    │     Extraction of feature importance across ML models
    │
    ├── 5_shap_analysis.py
    │     SHAP analysis for model interpretation
    │
    ├── feeagh/
    │   ├── data/
    │   └── output/
    │
    ├── sau/
    │   ├── data/
    │   └── output/
    │
    ├── figures/
    │     Figures used in the manuscript
    │
    ├── codes_supplementary/
    │     Additional scripts used in supplementary analyses
    │
    └── old_codes/
          Archived development scripts
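The numeric prefixes on the scripts suggest a run order. The sketch below builds one command per script in that order and prints the commands without executing them. The script names come from the structure above. Everything else is an assumption: the `Rscript` and `python` interpreters, invocation without arguments, running from the repository root, and the helper `build_commands`, which is hypothetical and not part of the repository.

```python
from pathlib import Path

# Script names are taken from the repository structure.
SCRIPTS = [
    "1_hyperparameter_tuning.R",
    "2_MLrun_most-influential-features.R",
    "3_MLrun_reanalysis-julianday.R",
    "4_extract_importance.R",
    "5_shap_analysis.py",
]

# Assumed interpreters; adjust to the local R and Python installations.
INTERPRETERS = {".R": "Rscript", ".py": "python"}


def build_commands(scripts=SCRIPTS):
    """Return one [interpreter, script] command per script, sorted by numeric prefix."""
    ordered = sorted(scripts, key=lambda name: int(name.split("_", 1)[0]))
    return [[INTERPRETERS[Path(name).suffix], name] for name in ordered]


if __name__ == "__main__":
    # Print the commands rather than running them; each could be passed to
    # subprocess.run(command, check=True) once the environment is set up.
    for command in build_commands():
        print(" ".join(command))
```

Printing first makes it easy to check the order and interpreters before starting long-running steps such as hyperparameter tuning.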
