Topics API Analysis

Authors: Nunes, Gabriel Henrique

Abstract

This repository provides the experimental results of the paper The Privacy-Utility Trade-off in the Topics API.

Usage

The notebooks were run using:

  • Python v3.11.8
  • bvmlib v1.0.0
  • matplotlib 3.8.0
  • numpy 1.24.3
  • pandas 2.0.1
  • qif 1.2.3
  • requests 2.31.0
  • scipy 1.11.3
  • tldextract 5.1.2
  • tqdm 4.66.1
  • urllib3 1.26.16

The datasets produced for the experiments can be found on Zenodo: AOL Dataset for Browsing History and Topics of Interest (DOI: 10.5281/zenodo.11029572).

Notebooks

Data treatment:

  • AOL-data-treatment.ipynb: Converts the original AOL dataset.
      • Treats inconsistencies;
      • Randomly remaps AnonID to RandID;
      • Defines domains from URLs; and
      • Filters domains by eTLD using tldextract and Mozilla's Public Suffix List, as of commit 5e6ac3a, extended by the discontinued TLDs: .bg.ac.yu, .ac.yu, .cg.yu, .co.yu, .edu.yu, .gov.yu, .net.yu, .org.yu, .yu, .or.tp, .tp, and .an.
    Generates the datasets AOL-treated.csv and AOL-treated-unique-domains.csv.
    The dataset AOL-treated.csv can be used for analyses of browsing history vulnerability and utility, as enabled by third-party cookies. This dataset contains singletons (individuals with only one domain in their browsing histories) and one outlier (one user with 150,802 domain visits in three months) that are dropped in some analyses.

  • Citizen-Lab-Classification-data-treatment.ipynb: Converts the Citizen Lab Classification data, as of commit ebd0ee8.
      • Treats inconsistencies;
      • Defines domains from URLs;
      • Filters domains by eTLD using tldextract and Mozilla's Public Suffix List, as of commit 5e6ac3a, extended by the discontinued TLDs: .bg.ac.yu, .ac.yu, .cg.yu, .co.yu, .edu.yu, .gov.yu, .net.yu, .org.yu, .yu, .or.tp, .tp, and .an; and
      • Merges classifications by domain.
    Generates the dataset Citizen-Lab-Classification.csv.

  • AOL-treated-Citizen-Lab-Classification-domain-matching.ipynb: Matches domains from AOL-treated-unique-domains.csv with domains and respective topics from Citizen-Lab-Classification.csv.
    Generates the dataset AOL-treated-Citizen-Lab-Classification-domain-match.csv.

  • AOL-treated-Google-Topics-Classification-v1-domain-matching.ipynb: Matches domains from AOL-treated-unique-domains.csv with domains and respective topics from Google-Topics-Classification-v1.txt, as provided by Google with the Chrome browser.
    Generates the dataset AOL-treated-Google-Topics-Classification-v1-domain-match.csv.

  • AOL-reduced-Citizen-Lab-Classification.ipynb: Converts the dataset AOL-treated.csv.
      • Reduces the dataset AOL-treated.csv according to the dataset AOL-treated-Citizen-Lab-Classification-domain-match.csv.
    Generates the dataset AOL-reduced-Citizen-Lab-Classification.csv.
    The dataset AOL-reduced-Citizen-Lab-Classification.csv can be used for analyses of browsing history vulnerability and utility, as enabled by third-party cookies, and for analyses of topics of interest vulnerability and utility, as enabled by the Topics API. This dataset contains singletons and the outlier that are dropped in some analyses. This dataset can be used for analyses including the (data-dependent) randomness of trimming down or filling up the top-s sets of topics for each individual so that each set has s topics. Privacy results for Generalization and utility results for Generalization, Bounded Noise, and Differential Privacy are expected to vary slightly with each run of the analyses over this dataset.

  • AOL-reduced-Google-Topics-Classification-v1.ipynb: Converts the dataset AOL-treated.csv.
      • Reduces the dataset AOL-treated.csv according to the dataset AOL-treated-Google-Topics-Classification-v1-domain-match.csv.
    Generates the dataset AOL-reduced-Google-Topics-Classification-v1.csv.
    The dataset AOL-reduced-Google-Topics-Classification-v1.csv can be used for analyses of browsing history vulnerability and utility, as enabled by third-party cookies, and for analyses of topics of interest vulnerability and utility, as enabled by the Topics API. This dataset contains singletons and the outlier that are dropped in some analyses. This dataset can be used for analyses including the (data-dependent) randomness of trimming down or filling up the top-s sets of topics for each individual so that each set has s topics. Privacy results for Generalization and utility results for Generalization, Bounded Noise, and Differential Privacy are expected to vary slightly with each run of the analyses over this dataset.

  • AOL-experimental.ipynb: Converts the dataset AOL-treated.csv.
      • Drops singletons (individuals with only one domain in their browsing histories) and one outlier (one user with 150,802 domain visits in three months); and
      • Defines browsing histories.
    Generates the dataset AOL-experimental.csv.
    The dataset AOL-experimental.csv can be used to empirically verify code correctness. All privacy and utility results are expected to remain the same with each run of the analyses over this dataset.

  • AOL-experimental-Citizen-Lab-Classification.ipynb: Converts the dataset AOL-reduced-Citizen-Lab-Classification.csv.
    Generates the dataset AOL-experimental-Citizen-Lab-Classification.csv.
    The dataset AOL-experimental-Citizen-Lab-Classification.csv can be used to empirically verify code correctness. All privacy and utility results are expected to remain the same with each run of the analyses over this dataset.

  • AOL-experimental-Google-Topics-Classification-v1.ipynb: Converts the dataset AOL-reduced-Google-Topics-Classification-v1.csv.
    Generates the dataset AOL-experimental-Google-Topics-Classification-v1.csv.
    The dataset AOL-experimental-Google-Topics-Classification-v1.csv can be used to empirically verify code correctness. All privacy and utility results are expected to remain the same with each run of the analyses over this dataset.

Analyses:

  • QIF-analyses-AOL-treated.ipynb: QIF analyses based on the dataset AOL-treated.csv.
    All privacy and utility results are expected to remain the same with each run of the analyses over this dataset.

  • QIF-analyses-AOL-reduced-Citizen-Lab.ipynb: QIF analyses based on the dataset AOL-reduced-Citizen-Lab-Classification.csv.
    Privacy results for Generalization and utility results for Generalization, Bounded Noise, and Differential Privacy are expected to vary slightly with each run of the analyses over this dataset.

  • QIF-analyses-AOL-reduced-Google-Topics-v1.ipynb: QIF analyses based on the dataset AOL-reduced-Google-Topics-Classification-v1.csv.
    Privacy results for Generalization and utility results for Generalization, Bounded Noise, and Differential Privacy are expected to vary slightly with each run of the analyses over this dataset.

  • QIF-analyses-counting-experiment.ipynb: QIF analysis for counting topic popularity using the binomial distribution.

  • QIF-analyses-AOL-experimental.ipynb: QIF analyses based on the dataset AOL-experimental.csv.
    All privacy and utility results are expected to remain the same with each run of the analyses over this dataset.

  • QIF-analyses-AOL-experimental-Citizen-Lab.ipynb: QIF analyses based on the dataset AOL-experimental-Citizen-Lab-Classification.csv.
    All privacy and utility results are expected to remain the same with each run of the analyses over this dataset.

  • QIF-analyses-AOL-experimental-Google-Topics-v1.ipynb: QIF analyses based on the dataset AOL-experimental-Google-Topics-Classification-v1.csv.
    All privacy and utility results are expected to remain the same with each run of the analyses over this dataset.

License

GNU GPLv3. To understand how the various GNU licenses are compatible with each other, please refer to the GNU licenses FAQ.
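The notebook descriptions above mention the randomness of trimming down or filling up each individual's top-s set of topics, which is why some results vary between runs. The following is a minimal sketch of one way such a step could work, not the code used in the notebooks. The function name top_s_topics, the parameters taxonomy and rng, the random tie-breaking, and the uniform fill-up from unused topics are all assumptions made for illustration.

```python
import random
from collections import Counter


def top_s_topics(visited_topics, taxonomy, s, rng=None):
    """Return exactly s topics for one individual (hypothetical helper).

    visited_topics: topic labels, one per classified domain visit.
    taxonomy: list of all topic labels available for filling up.
    s: size of the resulting top-s set.
    rng: random.Random instance; pass a seeded one for repeatable runs.
    """
    rng = rng or random.Random()
    counts = Counter(visited_topics)

    # Shuffle first, then sort by frequency. Python's sort is stable,
    # so topics with equal counts keep their shuffled (random) order.
    topics = list(counts)
    rng.shuffle(topics)
    topics.sort(key=lambda t: counts[t], reverse=True)

    # Trimming down: keep at most the s most frequent topics.
    top = topics[:s]

    # Filling up: assumed here to draw the missing topics uniformly at
    # random from taxonomy entries the individual has not visited.
    if len(top) < s:
        unused = [t for t in dict.fromkeys(taxonomy) if t not in counts]
        top.extend(rng.sample(unused, s - len(top)))
    return top


# Usage with made-up labels; unseeded runs may differ in tie-breaking
# and in the topics chosen for filling up.
example_taxonomy = ["Arts", "News", "Sports", "Travel", "Games", "Food"]
print(top_s_topics(["News", "Sports", "News"], example_taxonomy, s=3))
```

Passing a seeded random.Random makes a run repeatable. Without a seed, both the tie-breaking and the fill-up can differ from run to run, which is one plausible source of the run-to-run variation described for the reduced datasets.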

Keywords

Topics API, Third-Party Cookies, QIF, Microdata, Quantitative Information Flow
