
We suggest a user-oriented approach to combinatorial data anonymization. A data matrix is called k-anonymous if every row appears at least k times—the goal of the NP-hard k-ANONYMITY problem then is to make a given matrix k-anonymous by suppressing (blanking out) as few entries as possible. Building on previous work and coping with corresponding deficiencies, we describe an enhanced k-anonymization problem called PATTERN-GUIDED k-ANONYMITY, where the users specify in which combinations suppressions may occur. In this way, the user of the anonymized data can express the differing importance of various data features. We show that PATTERN-GUIDED k-ANONYMITY is NP-hard. We complement this by a fixed-parameter tractability result based on a “data-driven parameterization” and, based on this, develop an exact integer linear program (ILP)-based solution method, as well as a simple, but very effective, greedy heuristic. Experiments on several real-world datasets show that our heuristic easily matches up to the established “Mondrian” algorithm for k-ANONYMITY in terms of the quality of the anonymization and outperforms it in terms of running time.
exact algorithms, Data encryption (aspects in computer science), Industrial engineering. Management engineering, Parameterized complexity, tractability and kernelization, Integer programming, QA75.5-76.95, heuristics, experiments, T55.4-60.8, integer linear programming, Electronic computers. Computer science, NP-hardness, Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.), parameterized complexity
exact algorithms, Data encryption (aspects in computer science), Industrial engineering. Management engineering, Parameterized complexity, tractability and kernelization, Integer programming, QA75.5-76.95, heuristics, experiments, T55.4-60.8, integer linear programming, Electronic computers. Computer science, NP-hardness, Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.), parameterized complexity
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 6 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
