Prink: $$k_s$$-Anonymization for Streaming Data in Apache Flink

descriptionPublicationkeyboard_double_arrow_right Part of book or chapter of book , Article , Preprint , Conference object 01 Jan 2025Embargo end date: 01 Jan 2025 English Publisher:Springer Nature Switzerland

Authors: Philip Groneberg; Saskia Nuñez von Voigt; Thomas Janke; Louis Loechel; Karl Wolf; Elias Grünewald; Frank Pallas;

doi: 10.1007/978-3-032-00624-0_2 , 10.48550/arxiv.2505.13153

arXiv: 2505.13153

Prink: $$k_s$$-Anonymization for Streaming Data in Apache Flink

- Summary
- Subjects
- Metrics

Abstract

In this paper, we present Prink, a novel and practically applicable concept and fully implemented prototype for ks-anonymizing data streams in real-world application architectures. Building upon the pre-existing, yet rudimentary CASTLE scheme, Prink for the first time introduces semantics-aware ks-anonymization of non-numerical (such as categorical or hierarchically generalizable) streaming data in a information loss-optimized manner. In addition, it provides native integration into Apache Flink, one of the prevailing frameworks for enterprise-grade stream data processing in numerous application domains. Our contributions excel the previously established state of the art for the privacy guarantee-providing anonymization of streaming data in that they 1) allow to include non-numerical data in the anonymization process, 2) provide discrete datapoints instead of aggregates, thereby facilitating flexible data use, 3) are applicable in real-world system contexts with minimal integration efforts, and 4) are experimentally proven to raise acceptable performance overheads and information loss in realistic settings. With these characteristics, Prink provides an anonymization approach which is practically feasible for a broad variety of real-world, enterprise-grade stream processing applications and environments.

accepted for ARES 2025

Related Organizations

Charité - University Medicine Berlin
Germany
University of Salzburg
Austria
Technical University of Berlin
Germany

Keywords

Software Engineering (cs.SE), FOS: Computer and information sciences, Computer Science - Software Engineering, Computer Science - Cryptography and Security, Computer Science - Distributed, Parallel, and Cluster Computing, Distributed, Parallel, and Cluster Computing (cs.DC), Cryptography and Security (cs.CR)

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	2
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

2

Top 10%

Average

Green