Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2021
Data sources: Datacite
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2021
Data sources: Datacite
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2021
Data sources: Datacite
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2021
Data sources: Datacite
versions View all 4 versions
addClaim

This Research product is the result of merged Research products in OpenAIRE.

You have already added 0 works in your ORCID record related to the merged Research product.

A dataset of Vulnerable Code Changes of the Chormium OS project

Authors: Paul, Rajshakhar; Turzo, Asif Kamal; Bosu, Amiangshu;

A dataset of Vulnerable Code Changes of the Chormium OS project

Abstract

This dataset is associated with the paper ""Why Security Defects Go Unnoticed during Code Reviews? A Case-Control Study of the Chromium OS Project" To cite this dataset please use following: @inproceedings{paul-2021-ICSE, author = {Paul, Rajshakhar and Turzo, Asif K. and Bosu, Amiangshu}, title = {Why Security Defects Go Unnoticed during Code Reviews? A Case-Control Study of the Chromium OS Project}, booktitle = {Proceedings of the 43th International Conference on Software Engineering}, series = {ICSE'21}, year = {2021}, location = {Madrid, Spain}, pages = {TBD}, note={Acceptance rate = 138/602 (22%)}, } ---------------------------------------------------------------------------------------------------------------- We conducted a case-control study of Chromium OS project to identify the factors that differentiate code reviews that successfully identified security defects from those that missed such defects. We identified the cases and the controls based on our outcome of interest, namely whether a security defect was identified or escaped during the code review of a vulnerability contributing commit (VCC). Using a keyword-based mining approach followed by manual validations on a dataset of 404,878 Chromium OS code reviews, we identified 516 code reviews that successfully identified security defects. In addition, from the Chromium OS bug repository, we identified 239 security defects that escaped code reviews. Using a modified version of the SZZ algorithm followed by manual validations, we identified 374 VCCs and corresponding code reviews that approved those changes. For each of the 890 identified VCCs, we computed 25 different attributes that may influence the identification of a vulnerability during code reviews. Our artifact includes locations of these 890 VCCs as well as the 25 attributes for each VCC. Among those 25 attributes, we considered 18 attributes to build our model. To analyze our data, we developed a Logistic Regression model following the guidelines suggested by Harrell Jr. using the 18 attributes. The model, which achieved an AUC of 0.91, found nine code review metrics that distinguish code reviews that missed a vulnerability from the ones that did not. We have also made the R script to reproduce our results in the Github repository. Our dataset includes 890 real world vulnerable files. Each of those vulnerabilities is classified using the CWE specification. We envision several ways researchers may benefit from this artefact such as evaluating static analysis tools, training machine learning models, and replicating this study under a different context.

Related Organizations
Keywords

code review, vulnerability, security

  • BIP!
    Impact byBIP!
    citations
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
    OpenAIRE UsageCounts
    Usage byUsageCounts
    visibility views 47
    download downloads 19
  • 47
    views
    19
    downloads
    Powered byOpenAIRE UsageCounts
Powered by OpenAIRE graph
Found an issue? Give us feedback
visibility
download
citations
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
views
OpenAIRE UsageCountsViews provided by UsageCounts
downloads
OpenAIRE UsageCountsDownloads provided by UsageCounts
0
Average
Average
Average
47
19