handle: 10261/137725
The anonymization of query logs is an important process that must be performed before such sensitive data are published. It protects the anonymity of the users in the logs, which has already been shown to be at risk in logs released by well-known companies. This paper presents the anonymization of query logs using microaggregation. Our proposal ensures the k-anonymity of the users in the query log while preserving its utility. We evaluate our proposal on real query logs, showing the privacy and utility achieved, and provide estimations for the use of such data in data mining processes based on clustering. © 2011 Elsevier Ltd. All rights reserved.
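As a rough illustration of the general idea behind microaggregation (not the method proposed in the paper, which operates on query logs and is not detailed in this abstract), the sketch below groups one-dimensional numeric records into clusters of at least k members and replaces each record with its cluster mean. The function name `microaggregate` and the sort-and-chunk grouping are assumptions chosen for brevity.

```python
from statistics import mean


def microaggregate(values, k):
    """Replace each value with the mean of its group; every group has >= k members.

    Hypothetical, simplified univariate heuristic for illustration only.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(values)
    if n < k:
        raise ValueError("need at least k records")

    # Sort record indices by value so that similar records end up adjacent.
    order = sorted(range(n), key=lambda i: values[i])

    result = [None] * n
    start = 0
    while start < n:
        end = start + k
        # If fewer than k records would remain, fold them into this group.
        if n - end < k:
            end = n
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        # Every record in the group is published as the same centroid.
        for i in group:
            result[i] = centroid
        start = end
    return result


if __name__ == "__main__":
    print(microaggregate([1, 2, 3, 10, 11, 12, 13], 3))
```

Because every published value is shared by at least k records, no record can be distinguished from the other members of its group on this attribute alone, and keeping groups tight limits the information lost.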
Partial support by the Spanish MICINN (Projects eAEGIS TSI2007-65406-C03-02, TSI2007-65406-C03-01, ARES-CONSOLIDER INGENIO 2010 CSD2007-00004, Audit Transparency Voting Process PT-430000-2010-31, and N-KHRONOUS TIN2010-15764), the Spanish Ministry of Industry, Commerce and Tourism (Project TSI-020100-2009-720 and SeCloud TSI-020302-2010-153), and the Government of Catalonia (Grant 2009 SGR 1135) is acknowledged. G. Navarro-Arribas enjoys a Juan de la Cierva Grant (JCI-2008-3162) from the Spanish MICINN.
Peer Reviewed
Privacy, k-Anonymity, Microaggregation, Query log, Web search, Clustering
