ZENODO
Dataset . 2025
License: CC BY
Data sources: Datacite

CNR Ozone Sounding Merged (COSM) Dataset

Unified database of ozonesounding profiles from existing global archives
Authors: Marra, Fabrizio; Madonna, Fabio; Tramutola, Emanuele


Abstract

The unified database of ozonesounding profiles was obtained by merging three existing ozonesounding datasets, provided by the Southern Hemisphere ADditional OZonesondes (SHADOZ), the Network for the Detection of Atmospheric Composition Change (NDACC), and the World Ozone and Ultraviolet Radiation Data Centre (WOUDC). Because the networks use heterogeneous formats and provide varying levels of detail, even for measurements shared across different initiatives, only a selected set of variables of interest, both data and metadata, was used to build the unified dataset. These variables are listed in the following table.

| Standard name | Description | Unit |
|---|---|---|
| idstation | The name of the station. | N.A. |
| location_latitude | Latitude of the station. | deg |
| location_longitude | Longitude of the station. | deg |
| location_height | Altitude, elevation, or height of the defined platform plus instrument above sea level. | m |
| date_of_observation | Date and time when the ozonesonde was launched (format yyyy-mm-dd hh:mm:ss with time zone). | N.A. |
| time | Elapsed flight time since release. | s |
| pressure | Atmospheric pressure at each level, in pascals. | Pa |
| geop_alt | Geopotential height, in meters. | m |
| temperature | Air temperature, in kelvin. | K |
| relative_humidity | Relative humidity, expressed as a dimensionless quantity (unit 1). | 1 |
| wind_speed | Wind speed, in meters per second. | m/s |
| wind_direction | Wind direction, in degrees. | deg |
| latitude | Observation latitude (during the flight). | deg |
| longitude | Observation longitude (during the flight). | deg |
| altitude | Height of the sensor above local ground or sea surface. Positive values are above the surface (e.g., sondes), negative values below it (e.g., XBT). For visual observations, the height of the visual observing platform. | m (a.s.l.) |
| sample_temperature | Temperature where the sample is measured, in kelvin. | K |
| o3_partial_pressure | Partial pressure of ozone at each level, in pascals. | Pa |
| ozone_concentration | Mixing ratio of ozone at each level, in ppmv. | ppmv |
| ozone_partial_pressure_total_uncertainty | Total uncertainty in the calculation of the ozone partial pressure, as a composite of the individual uncertainty contributions. Uncertainties due to systematic bias are treated as random and assumed to follow a normal distribution. The calculation also accounts for the increased uncertainty incurred by homogenizing the data record. | Pa |
| network | Source network of the profile. | N.A. |
| type | Station classification flag. | N.A. |
| vertical_coverage_flag | Boolean flag indicating whether the ozone profile reaches the 10 hPa pressure level. Set to 't' if the profile reaches 10 hPa, 'f' otherwise. | N.A. |
| vertical_completeness_flag | Boolean flag indicating whether the ozone profile contains at least one data point every 100 meters throughout its vertical extent. Set to 't' if the profile is vertically complete (i.e., no gaps larger than 100 meters), 'f' otherwise. | N.A. |
| outliers_flag | Boolean flag indicating whether the ozone partial pressure profile (o3_partial_pressure) is free of strong outliers, based on the ±3·IQR method. Set to 't' if no strong outliers are found, 'f' otherwise. | N.A. |
| time_series_completeness_flag | Boolean flag indicating whether the time series for a given station includes at least three ozone profiles per month, allowing up to 5% of months without coverage. Set to 't' if this criterion is met, 'f' otherwise. | N.A. |
| filter_check | Profile quality control flag. | N.A. |
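The three per-profile flags described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of Tukey-style fences (Q1 − 3·IQR, Q3 + 3·IQR) for the "±3·IQR method", the inclusive quartile definition, and the use of the altitude column to measure vertical gaps are all assumptions made for illustration. The functions return Python booleans, whereas the dataset stores 't' or 'f'.

```python
from statistics import quantiles

def vertical_coverage_flag(pressure_pa):
    """True if the profile reaches the 10 hPa level (10 hPa = 1000 Pa)."""
    return min(pressure_pa) <= 1000.0

def vertical_completeness_flag(altitude_m, max_gap_m=100.0):
    """True if no two consecutive levels are more than max_gap_m apart."""
    levels = sorted(altitude_m)
    return all(upper - lower <= max_gap_m
               for lower, upper in zip(levels, levels[1:]))

def outliers_flag(o3_partial_pressure, k=3.0):
    """True if no value lies outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: 'strong outliers' are values beyond Tukey-style fences
    built from the quartiles of the profile itself.
    """
    q1, _, q3 = quantiles(o3_partial_pressure, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return all(low <= value <= high for value in o3_partial_pressure)
```

For example, `outliers_flag([1, 2, 3, 4, 5, 6, 7, 8, 100])` is `False` under these assumptions, because the quartiles are 3 and 7 and the value 100 lies above the upper fence of 19.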
The dataset is organized into two main tables:

- unified_header, which contains the metadata associated with each ozonesounding profile (idstation, date_of_observation, location_latitude, location_longitude, location_height, network, type, filter_check, vertical_coverage_flag, vertical_completeness_flag, outliers_flag, time_series_completeness_flag);
- unified_value, which contains the actual measurement data (idstation, date_of_observation, time, pressure, geop_alt, temperature, relative_humidity, wind_speed, wind_direction, latitude, longitude, altitude, sample_temperature, o3_partial_pressure, ozone_concentration, ozone_partial_pressure_total_uncertainty).

To improve accessibility and performance, both tables are further subdivided into year-specific subtables, allowing more efficient querying and data management across temporal ranges.

Among the metadata variables included in the unified_header table, type and filter_check play a key role in characterizing the quality and coverage of the ozonesounding profiles.

The type variable classifies each station based on the continuity of its time series. Stations are grouped into three classes, depending on whether they provide at least one profile per month for at least 95% of the months in a time series spanning:

- ≥20 years: Long Coverage (G);
- ≥10 and <20 years: Medium Coverage (Y);
- <10 years: Short Coverage (R).

The filter_check variable is a quality control flag ranging from 0 to 4 that summarizes the results of four structural checks applied to each profile:

- completeness of monthly coverage (at least three ascents per month);
- vertical coverage (reaching at least 10 hPa);
- vertical resolution (minimum one data point every 100 meters);
- detection of strong outliers (values in ozone profiles beyond ±3·IQR).

A higher filter_check value indicates better compliance with these criteria and, consequently, higher data reliability.
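As an illustration of how the monthly coverage check and the filter_check summary might be computed, here is a hedged Python sketch. It assumes that filter_check is simply the number of the four flags set to 't', and that "up to 5% of months without coverage" means at most 5% of the calendar months between the first and last launch have fewer than three profiles. Both are interpretations of the description, and the function and constant names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Flag columns of unified_header assumed to feed filter_check.
FLAG_COLUMNS = (
    "time_series_completeness_flag",
    "vertical_coverage_flag",
    "vertical_completeness_flag",
    "outliers_flag",
)

def time_series_completeness_flag(launches, min_per_month=3,
                                  max_missing_fraction=0.05):
    """True if enough months of a station's record are well covered.

    launches: list of datetime objects, one per profile of the station.
    """
    per_month = Counter((d.year, d.month) for d in launches)
    first, last = min(launches), max(launches)
    # Calendar months from the first to the last launch, inclusive.
    n_months = (last.year - first.year) * 12 + (last.month - first.month) + 1
    covered = sum(1 for n in per_month.values() if n >= min_per_month)
    return (n_months - covered) / n_months <= max_missing_fraction

def filter_check(header_row):
    """Count how many of the four flags equal 't' (assumed definition)."""
    return sum(1 for name in FLAG_COLUMNS if header_row[name] == "t")
```

Under these assumptions, a header row with three flags set to 't' and one set to 'f' yields a filter_check of 3, and a user could keep only profiles with `filter_check(row) == 4` for the strictest selection.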
The individual flags corresponding to each check are also provided in the dataset, allowing users to apply custom quality filters based on their specific research needs.

In addition to the dataset, two log files are provided to ensure full transparency of the quality control process, allowing users to trace all data removals and better understand the filtering criteria applied during dataset construction:

- delete_outliers.log: lists all strong outlier values removed from the dataset. Each entry includes the station identifier, the profile date, the pressure level, and the corresponding outlier value of o3_partial_pressure.
- delete_wrong_profile.log: lists all ozone profiles that were removed entirely because they were considered erroneous. These profiles typically exhibit values consistently close to zero or deviate significantly from the station's seasonal climatology. Each entry is catalogued by station and launch date.

Furthermore, an algorithm was implemented to merge the different datasets, handling their different features and duplicated profiles, i.e., profiles from different networks recorded within a 2-hour time window. In such cases, the profile that passes the greatest number of quality control (filter_check) tests is retained in the unified dataset. If multiple profiles pass the same number of tests, the selection is refined using additional indicators of dataset maturity, such as the availability of metadata, documentation, peer-reviewed publications, and especially the presence of measurement uncertainties associated with the ozone concentration profiles. This last criterion is prioritized, as uncertainties are routinely provided in SHADOZ and, only for a limited number of profiles, in NDACC, while they are generally absent in WOUDC.
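The duplicate-handling rule can be sketched in Python as follows. This is only an approximation of the described procedure, not the authors' algorithm: the `Profile` record, the `has_uncertainty` field, and the grouping strategy (profiles of the same station within 2 hours of the first profile in the group) are assumptions, and of the maturity indicators only the presence of uncertainties is modelled. Any remaining tie is resolved by keeping the earliest profile, which is also an assumption.

```python
from datetime import datetime, timedelta
from typing import NamedTuple

class Profile(NamedTuple):
    idstation: str
    date_of_observation: datetime
    network: str
    filter_check: int       # number of quality checks passed (0 to 4)
    has_uncertainty: bool   # hypothetical field: uncertainties available

def rank(profile):
    # Higher filter_check wins; ties go to profiles with uncertainties.
    return (profile.filter_check, profile.has_uncertainty)

def resolve_duplicates(profiles, window=timedelta(hours=2)):
    """Keep one profile per group of launches closer than `window`."""
    kept, group = [], []
    ordered = sorted(profiles,
                     key=lambda p: (p.idstation, p.date_of_observation))
    for p in ordered:
        starts_new_group = group and (
            p.idstation != group[0].idstation
            or p.date_of_observation - group[0].date_of_observation > window
        )
        if starts_new_group:
            kept.append(max(group, key=rank))
            group = []
        group.append(p)
    if group:
        kept.append(max(group, key=rank))
    return kept
```

With this sketch, a WOUDC profile and a SHADOZ profile of the same station launched 30 minutes apart, both with filter_check equal to 3, resolve to the SHADOZ profile if only that one carries uncertainties.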
