
Supporting datasets and ensemble members/submembers for the PASTEL model.

Brief description of individual entries:

v0_1_5_Awakens.csv: A merged dataset combining multiple airborne campaigns with supplementary 24-hour backward trajectory information. These are the input samples used to train PASTEL.

Koppen_npy_files.zip: NumPy arrays containing merged land (Beck et al. 2023) and ocean (Walterscheid 2011) Köppen climate classifications at 0.5° x 0.5° global resolution. Includes a Matplotlib colormap (Python .pkl), following Beck et al. (2023), along with alternative Köppen representations.

worldcities.zip: Simplemaps basic dataset (see attribution and license within).

df_preprocessed.csv: A preprocessed version of v0_1_5_Awakens.csv containing additional derived features and statistics. Can be used to bypass the preprocessing steps in the main PASTEL notebook.

AllTrajectories.zip: All 24-hour backward HYSPLIT trajectories generated for each sample in v0_1_5_Awakens.csv and df_preprocessed.csv, with varying meteorological inputs (see associated publication for details).

ne_10m_land.zip: Natural Earth shapefile containing land boundaries at 1:10m (1:10,000,000) scale.

ERA5_32yr_monthly_avg.nc: NetCDF file containing 32-year monthly averages of ERA5 data (ozone, specific humidity, relative humidity, temperature) over the study period.

ensemble.zip: Ensemble members and submembers contributing to PASTEL predictions, along with derived statistics and plots (≈27 GB uncompressed).

License

Code (not included here, see linked repository): GNU General Public License v3.0 (GPLv3).
Data: Creative Commons Attribution-ShareAlike 4.0 International (CC-BY-SA 4.0).
Third-party data (Simplemaps, Natural Earth) are redistributed under their respective licenses (see included attributions).

Citation

If you use this dataset, please cite:
Geiser, Victor (2025). Supplementary data for the Predictive model for Atmospheric Substances and Trace pollutants in the Environment using machine Learning (PASTEL). Zenodo. https://doi.org/10.5281/zenodo.17204569

How to Use

These datasets are intended for use with the PASTEL model, but may also be of independent value for climate classification, atmospheric transport analysis, or ensemble modeling.

Size Warning

ensemble.zip is roughly 27 GB uncompressed, because statistics and plotting information for all members/submembers are included.

Contact

For questions regarding this dataset or publication, please contact victor.w.geiser[at]gmail.com
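
Given the size warning for ensemble.zip, it may be worth checking an archive's uncompressed size before extracting it. The following is a minimal sketch using only Python's standard library; the helper name `summarize_zip` and the local path `ensemble.zip` are illustrative assumptions, not part of the PASTEL code.

```python
import os
import zipfile


def summarize_zip(path):
    """Return (file_count, total_uncompressed_bytes) without extracting anything."""
    with zipfile.ZipFile(path) as archive:
        # Skip directory entries so only real files are counted.
        files = [info for info in archive.infolist() if not info.is_dir()]
        return len(files), sum(info.file_size for info in files)


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the archive was downloaded.
    archive_path = "ensemble.zip"
    if os.path.exists(archive_path):
        count, size = summarize_zip(archive_path)
        print(f"{archive_path}: {count} files, {size / 1e9:.1f} GB uncompressed")
```

The same helper can be applied to the other archives in this record (for example AllTrajectories.zip) to gauge disk requirements before unpacking.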
Lagrangian Trajectories, Meteorology, Atmospheric chemistry, Earth System Modeling, Air quality, Supervised Machine Learning
