
handle: 10419/229135
Many organisations collect data about their processes, their customers, the use of their products, and many other topics in order to analyse these data with data mining, big data, machine learning, and similar approaches. In most cases, however, such data refer to individual people, and the persons concerned rightly expect that their data are adequately protected and kept private. The resulting restrictions on the use of the data often conflict with, and limit, the intended analysis. One technique that helps to overcome these conflicts and limitations is anonymisation: modifying the data in such a way that they no longer refer to individuals but still allow certain forms of analysis. Anonymising data, however, turns out to be far more complex than just removing names and other identifiers, and there are many examples where apparently anonymised data were de-anonymised and could be assigned to the individuals concerned after all. In response, a number of systematic techniques for evaluating and achieving anonymity, such as k-anonymity and differential privacy, have been developed. This report gives a first overview of the concept of anonymisation, the remaining threats to anonymity, and the main approaches used for anonymising data. It concludes with a summary of open research questions for further work.
ddc:004, anonymisation, data protection, anonymity, differential privacy, privacy
