
This dataset captures the evolution of QUIC traffic in an ISP network over a year-long collection period. Its goal is to provide a basis for studies of QUIC deployment as well as experiments in encrypted traffic classification.

Collected at: CESNET3 network - https://www.cesnet.cz/en/sit-cesnet3-eng
Sampling rate: uniform 1:100
Software used:
  ipfixprobe flow exporter - https://github.com/CESNET/ipfixprobe
  ipfixcol2 flow collector - https://github.com/CESNET/ipfixcol2
  QUIC plugin for extracting extended metadata about QUIC connections - https://github.com/CESNET/ipfixprobe/tree/master/src/plugins/process/quic

Parts of the dataset (specifically the July 2024 and April 2025 data) were used in the following publication:

Waiting for QUIC: Passive Measurements to Understand QUIC Deployments. Jonas Mücke et al. 2025. Proc. ACM Netw. 3, CoNEXT4, Article 41 (September 2025), 26 pages. https://doi.org/10.1145/3768988

We are preparing a short paper that describes the dataset in more detail. Until its publication, please cite the above-mentioned paper by Mücke et al. when using the dataset.

Dataset structure

The dataset is organized into 12 per-month ZIP files covering the period from June 2024 to May 2025. Each ZIP file contains a Parquet file for every day of the month, except for several dates affected by data outages, which are listed below. Each sample in the dataset represents a bidirectional flow record describing an observed QUIC connection, with the available fields detailed below.
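The per-month ZIP and per-day Parquet layout described above can be read with a short sketch like the following. It is only an illustration: the archive and member names are not specified in this description, so the code discovers members by their `.parquet` extension instead of assuming a naming scheme. The helper names `list_daily_files` and `load_day` are hypothetical, and `load_day` assumes that pandas with a Parquet engine (for example pyarrow) is installed.

```python
import io
import zipfile


def list_daily_files(zip_source):
    """Return the sorted names of all Parquet members in one monthly ZIP archive."""
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".parquet")
        )


def load_day(zip_source, member, columns=None):
    """Read one daily Parquet member into a pandas DataFrame.

    `columns` optionally restricts the read to a subset of fields,
    which keeps memory use down for large daily files.
    """
    import pandas as pd  # third-party; imported lazily so listing works without it

    with zipfile.ZipFile(zip_source) as archive:
        data = archive.read(member)  # raw bytes of the Parquet file
    return pd.read_parquet(io.BytesIO(data), columns=columns)
```

A typical call would list the members of one monthly archive and then load a single day with only the fields of interest, for example `columns=["QUIC_SNI", "QUIC_VERSION", "BYTES", "BYTES_REV"]`. Reading the member into memory first is a simple choice; for very large files, extracting the member to disk before reading may be preferable.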
Available fields

DST_IP: An anonymized identifier of the destination host
DST_IP_SUBNET: An anonymized identifier of the destination host subnet (a /24 prefix for IPv4 and a /64 prefix for IPv6)
DST_IP_VERSION: IP version (IPv4 or IPv6)
DST_ASN: Autonomous System Number of the destination host
DST_COUNTRY: Country of the destination host, derived from a geolocation database
DST_PORT: Destination port
PROTOCOL: Protocol used (UDP for all samples)
TIME_FIRST: Time of the first packet
TIME_LAST: Time of the last packet
DURATION: Duration of the flow in seconds
FLOW_END_REASON: Flow termination reason, using values assigned by IANA
BYTES: Number of bytes transmitted from client to server
BYTES_REV: Number of bytes transmitted from server to client
PACKETS: Number of packets sent from client to server
PACKETS_REV: Number of packets sent from server to client
QUIC_VERSION: QUIC version from the first server long-header packet
QUIC_CLIENT_VERSION: QUIC version from the first client long-header packet
QUIC_TOKEN_LENGTH: Token length from an Initial or Retry packet
QUIC_MULTIPLEXED: Indicates whether multiplexing occurred (value > 0 if at least two distinct QUIC_OSCID values were observed)
QUIC_ZERO_RTT: Number of 0-RTT packets observed in the flow
QUIC_OCCID: Original client Connection ID from the first client packet
QUIC_OSCID: Original server Connection ID from the first client packet
QUIC_SCID: Server Connection ID
QUIC_RETRY_SCID: Server Connection ID from a Retry packet
QUIC_SNI: Server Name Indication domain
QUIC_USER_AGENT: User-Agent string, if available in an Initial packet
QUIC_TLS_EXT_TYPE: List of TLS extensions used
QUIC_TLS_EXT_LEN: Corresponding lengths of the listed TLS extensions
QUIC_PACKETS: Sequence of QUIC long-header packet types observed in the flow
PPI: Packet sequence represented as [[inter-packet times], [packet directions], [packet sizes]]
PPI_LEN: Number of packets in the PPI sequence
PPI_DURATION: Duration of the PPI sequence in seconds
PPI_ROUNDTRIPS: Number of roundtrips in the PPI sequence
PHIST_SRC_SIZES: Histogram of packet sizes from client to server
PHIST_DST_SIZES: Histogram of packet sizes from server to client
PHIST_SRC_IPT: Histogram of inter-packet times from client to server
PHIST_DST_IPT: Histogram of inter-packet times from server to client

Missing data

31.10.2024, with 30.10. and 1.11. showing reduced data volume.
11.12.2024–14.12.2024, with 10.12. and 15.12. showing reduced data volume.
A smaller data outage on 15.3.2025–16.3.2025.

Ethics

The privacy of users is of utmost importance to us. We emphasize that the dataset does not include client IP addresses; therefore, it is not possible to trace the identity of data subjects. Moreover, the dataset consists solely of flow records; no payload data, apart from metadata available in QUIC handshakes, is included. The data was collected on the basis of a legitimate interest (i.e., not consent), among other things, for the purpose of ensuring the further development of services provided to the scientific and research community.

We further applied the following anonymization measures:

Destination IP addresses are hashed with a secret salt, transforming them into non-reversible identifiers.
Flow start times are clipped to the hour, with end times adjusted accordingly.
Source ports are omitted.
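
To make the anonymization measures above more concrete, here is a minimal sketch of how such transformations could look. It is not the authors' implementation: the description does not name the hash function or explain how end times are adjusted. The sketch assumes HMAC-SHA256 keyed with the secret salt, and it assumes that "adjusted accordingly" means the end time is shifted so that the flow duration is preserved. All function names are hypothetical.

```python
import hashlib
import hmac
import ipaddress
from datetime import datetime


def anonymize_ip(ip: str, salt: bytes) -> str:
    """Map an IP address to a non-reversible identifier.

    Assumption: HMAC-SHA256 keyed with the secret salt; the dataset
    description does not state which keyed hash was actually used.
    """
    packed = ipaddress.ip_address(ip).packed  # canonical byte form of the address
    return hmac.new(salt, packed, hashlib.sha256).hexdigest()


def subnet_of(ip: str) -> str:
    """Return the /24 (IPv4) or /64 (IPv6) prefix that contains the address."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 64
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def clip_times(first: datetime, last: datetime):
    """Clip the start time to the hour and shift the end time with it.

    Assumption: the end time is moved so that the duration (last - first)
    stays unchanged.
    """
    clipped = first.replace(minute=0, second=0, microsecond=0)
    return clipped, clipped + (last - first)
```

Under these assumptions, the same destination always maps to the same identifier within the dataset, so per-host and per-subnet statistics remain possible even though the original addresses cannot be recovered without the salt.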
Network monitoring, Traffic classification, QUIC, Encrypted traffic
