
This record is a global open-source passenger air traffic dataset, primarily intended for the research community. It gives the seating capacity available on each origin-destination route for a given year, 2019, along with the associated aircraft and airline when this information is available.

Context on the original work is given in the related articles (https://doi.org/10.59490/joas.2024.7365, https://doi.org/10.59490/joas.2023.7201) and on the associated GitHub page (https://github.com/AeroMAPS/AeroSCOPE/). A simple data exploration interface will be available at www.aeromaps.eu/aeroscope.

The dataset was created by aggregating various available open-source databases with limited geographical coverage. It was then completed using a route database created by parsing Wikipedia and Wikidata, on which the traffic volume was estimated using a machine learning algorithm (XGBoost) trained on traffic and socio-economic data.

1- DISCLAIMER

The dataset was gathered to allow highly aggregated analyses of air traffic, at the continental or country level. At the route level, the accuracy is limited, as mentioned in the associated article, and improper usage could lead to erroneous analyses.

Although all sources used are open to everyone, the Eurocontrol database is only freely available to academic researchers. It is used in this dataset in a very aggregated way and under several levels of abstraction. As a result, it is not distributed in its original format, as specified in the contract of use. As a general rule, we decline any responsibility for any use that is contrary to the terms and conditions of the various sources that are used. In case of commercial use of the database, please contact us in advance.

2- DESCRIPTION

Each data entry represents an (Origin-Destination-Operator-Aircraft type) tuple. Please refer to the support article for more details (see above).
The dataset contains the following columns:

"First column": index
airline_iata: IATA code of the operator in nominal cases. An ICAO -> IATA code conversion was performed for some sources, and the ICAO code was kept if no match was found.
acft_icao: ICAO code of the aircraft type
acft_class: Aircraft class identifier, own classification.
    WB: Wide Body
    NB: Narrow Body
    RJ: Regional Jet
    PJ: Private Jet
    TP: Turbo Propeller
    PP: Piston Propeller
    HE: Helicopter
    OTHER
seymour_proxy: Aircraft code for Seymour Surrogate (https://doi.org/10.1016/j.trd.2020.102528), own classification to derive a proxy aircraft when the nominal aircraft type is unavailable in the aircraft performance model.
source: Original data source for the record, before compilation and enrichment.
    ANAC: Brazilian Civil Aviation Authorities
    AUS Stats: Australian Civil Aviation Authorities
    BTS: US Bureau of Transportation Statistics T100
    Estimation: Own model, estimation on Wikipedia-parsed route database
    Eurocontrol: Aggregation and enrichment of R&D database
    OpenSky
    World Bank
seats: Number of seats available for the data entry, AFTER airport residual scaling
n_flights: Number of flights of the data entry, when available
iata_departure, iata_arrival: IATA code of the origin and destination airports. A few BTS in-house identifiers may remain, but this is marginal.
departure_lon, departure_lat, arrival_lon, arrival_lat: Origin and destination coordinates; may be NaN if the IATA identifier is erroneous
departure_country, arrival_country: Origin and destination country ISO2 code. WARNING: disable NA (Namibia) as default NaN at import
departure_continent, arrival_continent: Origin and destination continent code. WARNING: disable NA (North America) as default NaN at import
seats_no_est_scaling: Number of seats available for the data entry, BEFORE airport residual scaling
distance_km: Flight distance (km)
ask: Available Seat Kilometres
rpk: Revenue Passenger Kilometres (simple calculation from ASK using IATA average load factor)
fuel_burn_seymour: Fuel burn per flight (kg) when seymour proxy available
fuel_burn: Total fuel burn of the data entry (kg)
co2: Total CO2 emissions of the data entry (kg)
domestic: Domestic/international boolean (Domestic=1, International=0)

3- CITATION

Please cite the support paper instead of the dataset itself.

Salgas, A., Sun, J., Delbecq, S., Planès, T., & Lafforgue, G. (2024). Compilation and Applications of an Open-Source Dataset on Global Air Traffic Flows and Carbon Emissions. Journal of Open Aviation Science. https://doi.org/10.59490/joas.2023.7201
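
The two NA warnings in the column description (Namibia as a country code, North America as a continent code) matter at import time, because many CSV readers treat the string "NA" as a missing value by default. The sketch below shows one way to handle this with pandas, followed by a country-level aggregation of the kind the disclaimer recommends. It rests on assumptions: the data is read as a CSV, missing values are written as empty fields, and the three-row sample (including the "AF" and "EU" continent codes and the seat counts) is hypothetical and stands in for the real file. `load_aeroscope` is an illustrative helper name, not part of any published tooling.

```python
import io

import pandas as pd

# Hypothetical extract with a subset of the documented columns.
# Values are made up for illustration; they are not taken from the dataset.
SAMPLE_CSV = """,iata_departure,iata_arrival,departure_country,arrival_country,departure_continent,arrival_continent,seats
0,WDH,JNB,NA,ZA,AF,AF,1000
1,JFK,LAX,US,US,NA,NA,5000
2,CDG,XXX,FR,,EU,,200
"""


def load_aeroscope(path_or_buffer):
    """Read the table while keeping the literal string "NA" as data."""
    return pd.read_csv(
        path_or_buffer,
        index_col=0,            # the unnamed first column is the index
        keep_default_na=False,  # do not treat "NA", "NaN", "null", ... as missing
        na_values=[""],         # assumption: only empty fields mean "missing"
    )


df = load_aeroscope(io.StringIO(SAMPLE_CSV))

# Aggregate at the country level rather than reading individual routes.
seats_by_country = df.groupby("departure_country")["seats"].sum()
print(seats_by_country)
```

With the default pandas settings, the Namibian and North American rows would silently lose their country or continent code and drop out of any `groupby` on those columns, which is why the import options are set explicitly here.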
Flows, Traffic, CO2, Aviation, Open-Source, Open-Data
