
Description

This Zenodo repository contains all datasets required to construct and apply the Urban Traffic Spatio-Temporal Knowledge Graph (ST-KG) and to reproduce the experimental results reported in the accompanying paper. The datasets are organized to support a step-by-step reproducible workflow corresponding to Step 1–Step 6 in the associated code repository.

Data Files and Descriptions

1. Spatial Grid (Honeycomb) Data

dz_honeycomb_125.shp: A regular hexagonal grid covering Shanghai, with a side length of 125 meters. This file provides the fundamental spatial units used to construct grid entities in the ST-KG.

honeybuffer_125.shp: A buffered version of dz_honeycomb_125.shp with slightly expanded boundaries. This file is used to accurately identify adjacency relationships between neighboring grid (honeycomb) entities.

2. Road Network Data

split_result.shp: A preprocessed and segmented Shanghai urban road network, where road geometries have been cleaned and split at intersections. This file is used to construct road entities, "touch" relationships between roads, and grid–road "within" relationships.

roadcrs.txt: Stores the coordinate reference system (CRS) information of the road network data to ensure spatial consistency during computation.

3. Point of Interest (POI) Data

POI/: A directory containing multiple categories of Points of Interest (POIs) in Shanghai. These data are used to construct POI entities and "contains" relationships between POI entities and grid entities.

4. Taxi Trajectory Data

Taxi_raw_data/: A directory containing raw taxi trajectory data for Shanghai, covering April 2015. Each subfolder corresponds to an individual taxi and stores all recorded trajectories for that vehicle. Two sample taxi trajectory files are provided; additional data are available from the authors upon request.

folder_name_list.txt: A list of taxi IDs used to iterate through taxi trajectory folders during trajectory processing and state construction.

5. ST-KG Acceleration Data

honeycomb_cache.pkl: A serialized cache extracted from the constructed ST-KG, storing "within" relationships between grid (honeycomb) entities and road entities. This file is used to accelerate Neo4j queries and grid-level computations such as vehicle speed aggregation.

honeycomb_paths_dict.pkl: Stores, for each grid (honeycomb) entity, a path composed of road entities that are connected by "touch" relationships within the grid. This file is used to accelerate path-based queries and analyses involving connected road segments inside individual grid entities.

6. Traffic Speed Prediction Data (Regional Scale: Huangpu District, Shanghai)

All prediction-related data are stored in the prediction_data directory and are used exclusively for grid-level traffic speed prediction in the Huangpu District of Shanghai.

feature_matrix_X.csv: A feature matrix for traffic speed prediction in the Huangpu District, generated using data_manage.ipynb and interpolation.ipynb.

adj_matrix.csv: A road-based grid adjacency matrix for the Huangpu District, generated using data_manage2.ipynb and adj_create.ipynb.

POI/: Stores the number of different POI categories within each grid cell in the Huangpu District, generated by POI_class_num.ipynb.

precipitation/: Stores precipitation station information within the Huangpu District, generated by precipitation.ipynb.

Naming Notes

In the paper, the spatial unit is referred to as "grid". Since the grid cells are hexagonal in shape, they are referred to as "honeycomb" in the dataset.

In the paper, the concept of "state" is derived from mapped trajectory points. In the dataset, this concept is directly referred to as "trajectory_point", reflecting its origin from trajectory data after map matching.
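
As an illustration of how a grid-to-road cache such as honeycomb_cache.pkl could support grid-level speed aggregation without querying Neo4j, the following is a minimal Python sketch. The internal layout of the pickle file is not documented here, so the sketch assumes a plain dictionary mapping each honeycomb ID to a list of road IDs. The names toy_cache, road_speeds, and aggregate_grid_speed, as well as all IDs and speed values, are hypothetical, and the unweighted mean is only one possible aggregation, not necessarily the one used in the paper.

```python
import io
import pickle
from statistics import mean

# ASSUMPTION: the cache is a dict {honeycomb_id: [road_id, ...]}.
# The real structure of honeycomb_cache.pkl may differ.
toy_cache = {
    "hc_1": ["road_a", "road_b"],
    "hc_2": ["road_b", "road_c"],
    "hc_3": [],  # a grid cell with no roads inside it
}

# Round-trip through pickle in memory to mimic loading the cache from disk.
# With the real file this would be: pickle.load(open(path, "rb")).
buffer = io.BytesIO()
pickle.dump(toy_cache, buffer)
buffer.seek(0)
honeycomb_to_roads = pickle.load(buffer)

# Hypothetical per-road speed samples in km/h for one time window.
road_speeds = {
    "road_a": [30.0, 34.0],
    "road_b": [20.0],
    "road_c": [40.0, 44.0],
}


def aggregate_grid_speed(honeycomb_to_roads, road_speeds):
    """Return the unweighted mean speed per honeycomb, or None if no samples."""
    grid_speed = {}
    for hc_id, road_ids in honeycomb_to_roads.items():
        # Collect every speed sample observed on roads within this cell.
        samples = [s for r in road_ids for s in road_speeds.get(r, [])]
        grid_speed[hc_id] = mean(samples) if samples else None
    return grid_speed


grid_speed = aggregate_grid_speed(honeycomb_to_roads, road_speeds)
```

Because the lookup is a local dictionary access, no database round trip is needed per grid cell. Pickle files can execute arbitrary code when loaded, so they should only be unpickled when obtained from a trusted source.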
