
doi: 10.3233/jcs-130491
Web search is one of the most rapidly growing applications on the internet today. However, the current practice of most search engines, logging and analyzing users' queries, raises serious privacy concerns. In this paper, we concentrate on two existing solutions that are relatively easy to deploy: query obfuscation and anonymizing networks. In query obfuscation, client-side software attempts to mask real user queries by injecting noisy queries. Anonymizing networks route user queries through a series of relay servers, hiding the actual query source from the search engine. A fundamental problem with these solutions, however, is that user queries are still revealed to the search engine, although they are “mixed” among queries generated either by a machine or by other users. We focus on TrackMeNot (TMN), a popular query obfuscation tool, and the Tor anonymizing network, and analyze whether these solutions can actually preserve users' privacy in practice against an adversarial search engine. We demonstrate that a search engine equipped with only a short-term history of a user's search queries can break the privacy guarantees of TMN and Tor using only off-the-shelf machine learning techniques.
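To make the threat model concrete, the following is a minimal sketch of how an adversary holding a short history of a user's queries might separate real queries from injected noise. It is a toy token-overlap heuristic, not the classifiers used in the paper; the function names, the sample queries, and the 0.5 threshold are all illustrative assumptions.

```python
from collections import Counter


def tokenize(query):
    """Lowercase a query and split it on whitespace."""
    return query.lower().split()


def build_profile(history):
    """Count token occurrences across the user's known past queries."""
    profile = Counter()
    for query in history:
        profile.update(tokenize(query))
    return profile


def score(query, profile):
    """Return the fraction of the query's tokens already seen in the history."""
    tokens = tokenize(query)
    if not tokens:
        return 0.0
    seen = sum(1 for token in tokens if token in profile)
    return seen / len(tokens)


def split_stream(stream, history, threshold=0.5):
    """Partition a mixed query stream into likely-user and likely-noise lists.

    The threshold is an arbitrary illustrative value, not a tuned parameter.
    """
    profile = build_profile(history)
    likely_user, likely_noise = [], []
    for query in stream:
        if score(query, profile) >= threshold:
            likely_user.append(query)
        else:
            likely_noise.append(query)
    return likely_user, likely_noise


# Hypothetical data: a short-term history and a stream mixing real and noisy queries.
history = [
    "python list comprehension",
    "python dict sort by value",
    "numpy array reshape",
]
stream = [
    "python sort list of tuples",
    "celebrity wedding photos",
    "numpy array slice",
    "stock market news today",
]

user, noise = split_stream(stream, history)
```

Even this crude heuristic illustrates the underlying weakness: because real queries reach the search engine unmodified, any consistency between them and the user's earlier behavior is available as a signal. A realistic attack would likely replace the overlap score with a trained classifier over richer features.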
| selected citations | 14 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
