
This dataset provides real-world Round-Trip Time (RTT) latency measurements collected from a geographically distributed set of probing nodes (Monitors) to target IP addresses across Europe. It is intended to support machine learning research in IP geolocation, providing curated datasets for both model training and performance evaluation. Measurements were collected using ICMP echo requests during a campaign that ran from November 27, 2024 to January 30, 2025.

Scenario Description

The measurements were gathered from six virtual machines acting as Monitors, deployed across Azure regions in Europe: Madrid (Spain), Dublin (Ireland), Frankfurt (Germany), Warsaw (Poland), Gävle (Sweden), and Milan (Italy). These Monitors probed two categories of destinations:
· Landmarks: Nodes with known and verified geographical coordinates, used for model training.
· Targets: Nodes whose coordinates are also known, used exclusively for validation and treated as unknown during inference.

The RTT data are structured as fingerprint vectors, where each vector consists of latency statistics from all six Monitors to a specific destination IP in a given measurement instance. Each vector includes three RTT-based features per Monitor: mean, geometric mean, and standard deviation.

Dataset Structure

Each dataset contains multiple rows, where each row represents an RTT fingerprint vector consisting of latency measurements from the six Monitors to a given node.

1) Learning Dataset: Landmark_RTTfingerprint_dataset.csv
Monitors deployed: 6 (distributed across Microsoft Azure regions in Madrid, Dublin, Frankfurt, Warsaw, Gävle, and Milan).
Landmarks: nodes with known geographical locations, used for training models.
Columns:
o measure_id: Unique identifier for each measurement.
o landmark_id: ID of the geolocated node used for training.
o landmark_type: Type of landmark (dns, ripe_anchor, ripe_probe).
o dst_ip: IP address of the landmark node.
o init_time: Timestamp of the measurement.
o country_code_gt: Ground truth country code of the node.
o latitude_gt, longitude_gt: Ground truth geolocation of the landmark node.
o 4h_time_slot, 6h_time_slot: Time window identifiers indicating when the measurement was taken.
o mean_latency_m1 – mean_latency_m6: Mean RTT from each of the 6 Monitors (milliseconds).
o geomean_latency_m1 – geomean_latency_m6: Geometric mean RTT from each of the 6 Monitors (milliseconds).
o std_latency_m1 – std_latency_m6: Standard deviation of the RTT from each of the 6 Monitors (milliseconds).

2) Validation Dataset: ValidationDataset_RTT_dispersed_EU.csv
Monitors deployed: 6 (same as Learning Dataset).
Targets: nodes used to evaluate model performance. Their actual locations are known but treated as unknown during inference.
Columns:
o measure_id: Unique identifier for each measurement.
o target_id: ID of the target node used for validation.
o target_type: Type of target (dns, ripe_anchor, ripe_probe).
o dst_ip: IP address of the target node.
o init_time: Timestamp of the measurement.
o country_code_gt: Ground truth country code of the node.
o latitude_gt, longitude_gt: Ground truth geolocation of the target node.
o 4h_time_slot, 6h_time_slot: Time window identifiers indicating when the measurement was taken.
o mean_latency_m1 – mean_latency_m6: Mean RTT from each of the 6 Monitors (milliseconds).
o geomean_latency_m1 – geomean_latency_m6: Geometric mean RTT from each of the 6 Monitors (milliseconds).
o std_latency_m1 – std_latency_m6: Standard deviation of the RTT from each of the 6 Monitors (milliseconds).
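
The column layout described above can be turned into feature vectors with a few lines of standard-library Python. The following is a minimal sketch, not part of the dataset distribution: it assumes the files are comma-delimited with a header row matching the column names listed here, and that rows with empty or non-numeric latency values should be skipped. The helper names (feature_columns, load_fingerprints, haversine_km) are hypothetical, and the great-circle distance is included only as one common way to score a predicted location against latitude_gt and longitude_gt.

```python
import csv
import math

N_MONITORS = 6  # Monitors m1..m6, as listed in the column description
STATS = ("mean_latency", "geomean_latency", "std_latency")
EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)


def feature_columns():
    """Return the 18 fingerprint column names: 3 statistics x 6 Monitors."""
    return [f"{stat}_m{i}" for stat in STATS for i in range(1, N_MONITORS + 1)]


def load_fingerprints(csv_file):
    """Read fingerprint vectors and ground-truth coordinates from an open CSV file.

    Assumption: comma-delimited text with a header row. Rows whose latency or
    coordinate fields are empty or non-numeric are skipped. A missing column
    raises KeyError so that a schema mismatch is not silently ignored.
    """
    columns = feature_columns()
    vectors, locations = [], []
    for row in csv.DictReader(csv_file):
        try:
            vector = [float(row[name]) for name in columns]
            location = (float(row["latitude_gt"]), float(row["longitude_gt"]))
        except (TypeError, ValueError):
            continue  # incomplete measurement row
        vectors.append(vector)
        locations.append(location)
    return vectors, locations


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))
```

A typical call would open Landmark_RTTfingerprint_dataset.csv with open(path, newline="") and pass the file object to load_fingerprints; the same function applies to ValidationDataset_RTT_dispersed_EU.csv because both files share the latency and ground-truth column names. The file encoding is not stated in the description and may need to be set explicitly.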
fingerprinting, RTT latency, IP geolocation
