publication . Conference object . 2010

A study on the re-identifiability of Dutch citizens

Koot, M.R.; van 't Noordende, G.; de Laat, C.; Serjantov, A.; Troncoso, C.
Open Access English
  • Published: 01 Jan 2010
This paper analyses the re-identifiability of Dutch citizens by various demographics. Our analysis is based on registry office data of 2.7 million Dutch citizens, ~16% of the total population. We provide overall statistics on re-identifiability for a range of quasi-identifiers, and present an in-depth analysis of quasi-identifiers found in two de-identified data sets. We found that 67.0% of the sampled population is unambiguously identifiable by date of birth and four-digit postal code alone, and that 99.4% is unambiguously identifiable if date of birth, full postal code and gender are known. Furthermore, two quasi-identifiers we examined from real-life data set...
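The kind of statistic quoted in the abstract (the share of a population that is unambiguously identifiable by a given quasi-identifier) can be illustrated with a minimal sketch. This is not the authors' implementation; the function name `unique_fraction`, the field names `dob`, `pc4` and `gender`, and the toy records are all hypothetical and chosen only for illustration. The sketch counts how many records carry a combination of quasi-identifier values that no other record shares.

```python
from collections import Counter


def unique_fraction(records, quasi_identifier):
    """Return the fraction of records whose quasi-identifier values occur exactly once.

    records: list of dicts (hypothetical schema).
    quasi_identifier: tuple of field names to combine.
    """
    if not records:
        return 0.0
    # Project every record onto the chosen quasi-identifier fields.
    keys = [tuple(r[field] for field in quasi_identifier) for r in records]
    # Count how many records share each combination of values.
    counts = Counter(keys)
    # A record is unambiguously identifiable if its combination is unique.
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Hypothetical toy data, not taken from the paper.
records = [
    {"dob": "1970-01-01", "pc4": "1012", "gender": "F"},
    {"dob": "1970-01-01", "pc4": "1012", "gender": "M"},
    {"dob": "1985-06-15", "pc4": "3511", "gender": "F"},
    {"dob": "1985-06-15", "pc4": "3511", "gender": "F"},
]

print(unique_fraction(records, ("dob", "pc4")))
print(unique_fraction(records, ("dob", "pc4", "gender")))
```

In this toy data, adding `gender` to the quasi-identifier separates the first two records, so the identifiable share rises. The abstract reports the same kind of effect when gender and the full postal code are added.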

