A Peer-to-Peer Protocol and System Architecture for Privacy-Preserving Statistical Analysis
Conference object, Part of book or chapter of book
- Publisher: Springer International Publishing
[INFO] Computer Science [cs] | Privacy-preserving statistical analysis | [SHS.INFO] Humanities and Social Sciences/Library and information sciences | Secure summation protocol | Statistical processing of health records
Part 2: Special Session on Privacy Aware Machine Learning for Health Data Science (PAML 2016); International audience; The insights gained from large-scale analysis of health-related data can have an enormous impact on public health and medical research, but access to such personal and sensitive data has serious privacy implications for the data provider and imposes a heavy data security and administrative burden on the data consumer. In this paper we present an architecture that fills the gap between the statistical tools ubiquitously used in medical research on the one hand, and privacy-preserving data mining methods on the other. This architecture foresees the primitive instructions needed to re-implement the elementary statistical methods so that they access data only via a privacy-preserving protocol. The advantage is that more complex analysis and visualisation tools built upon these elementary methods can remain unaffected. Furthermore, we introduce RASSP, a secure summation protocol that implements the primitive instructions foreseen by the architecture. An open-source reference implementation of this architecture is provided for the R language. We use these results to argue that the tension between medical research and privacy requirements can be technically alleviated, and we outline a research plan towards a system that covers further requirements on computational efficiency and on the trust that the medical researcher can place in the statistical results it produces.
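The idea of re-implementing an elementary statistic on top of a secure summation primitive can be illustrated with a minimal sketch. The code below is not RASSP and is not taken from the paper's R implementation; it assumes a generic additive secret-sharing scheme, simulated in a single Python process, and every name in it (`MODULUS`, `make_shares`, `secure_sum`, `private_mean`) is hypothetical. It only shows how a mean could be computed while calling nothing but a summation primitive.

```python
import secrets

# Hypothetical modulus; it only needs to exceed any sum we expect to compute.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split a non-negative integer into n_parties random additive shares.

    The shares sum to `value` modulo `modulus`; any subset of fewer than
    n_parties shares is uniformly random and reveals nothing about `value`.
    """
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate a secure sum among len(private_values) peers.

    Assumption: a generic additive secret-sharing scheme, not RASSP.
    Peer i splits its value and sends share j to peer j; each peer
    publishes only the sum of the shares it received.
    """
    n = len(private_values)
    # sent[i][j] is the share that peer i sends to peer j.
    sent = [make_shares(v % modulus, n, modulus) for v in private_values]
    # Each peer j adds up the shares it received and publishes that partial sum.
    partial_sums = [sum(sent[i][j] for i in range(n)) % modulus for j in range(n)]
    # Combining the published partial sums yields the total and nothing else.
    return sum(partial_sums) % modulus


def private_mean(private_values):
    """An elementary statistic rebuilt on the summation primitive alone."""
    return secure_sum(private_values) / len(private_values)


if __name__ == "__main__":
    ages = [34, 51, 47, 29]  # made-up example values, one per peer
    print(secure_sum(ages), private_mean(ages))
```

Under these assumptions, higher-level tools that call `private_mean` would not need to change when the summation primitive underneath is swapped for a different protocol, which is the layering the abstract describes.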