Powered by OpenAIRE graph
ZENODO
Dataset . 2026
License: CC BY
Data sources: ZENODO
Data sources: Datacite

Dataset for the paper "Software Documentation References to Unmaintained Repositories: An Empirical Study", ICSME 2026

Abstract

Dataset for the paper "Software Documentation References to Unmaintained Repositories: An Empirical Study", submitted to ICSME 2026. It comprises the following parts:

Dataset 1: GitHub repositories referenced in websites. This dataset contains the 51,265 distinct GitHub repositories referenced in the analyzed websites.

Dataset 2: Websites referencing GitHub repositories. This dataset contains 2,070 websites, each with at least one reference to a GitHub repository. At the median, the repositories behind these 2,070 websites have 17.8K stars, 191 contributors, 121K LOC, 1.6K issues, 3.8K commits, and 69 releases.

Dataset 3: Software websites referencing GitHub repositories. This dataset contains the websites of the 100 most-starred repositories that are real software systems and whose websites contain at least 10 references to GitHub repositories. The threshold of 10 filters out websites that reference GitHub repositories only infrequently, which are outside the scope of this analysis. At the median, the repositories behind these 100 websites have 68.3K stars, 394 contributors, 400K LOC, 713 issues, 14.8K commits, and 190 releases. The three most-starred are React, TensorFlow, and Microsoft VSCode.

Dataset 4: Software documentation referencing GitHub repositories. Starting from Dataset 3, we manually inspected all webpages and selected those related to software documentation, such as learning guides, tutorials, and API references. This filtering step removed noisy pages, including translations, outdated documentation, demos, and datasets. The manual assessment yielded 1,351 webpages containing software documentation.

repos-ghs: 2,617 repositories with websites, obtained from the SEART GitHub Search Engine (seart-ghs).
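The selection described for Dataset 3 (real software systems, at least 10 GitHub references, 100 most starred) could be sketched roughly as below. This is only an illustration: the `Website` record, its field names, and the order of the steps (filter first, then rank by stars) are assumptions, not the authors' actual pipeline or the layout of the published files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Website:
    # Hypothetical record layout; the published files may use other fields.
    repo: str          # "owner/name" of the repository the website belongs to
    stars: int         # GitHub star count of that repository
    is_software: bool  # True if the repository is a real software system
    github_refs: int   # number of GitHub repository references on the website


def select_software_websites(websites, min_refs=10, top_n=100):
    """Keep software websites with at least min_refs GitHub references,
    then return the top_n of them ordered by star count, highest first."""
    eligible = [
        w for w in websites
        if w.is_software and w.github_refs >= min_refs
    ]
    return sorted(eligible, key=lambda w: w.stars, reverse=True)[:top_n]


# Made-up sample records, for illustration only.
sample = [
    Website("a/a", 500, True, 12),
    Website("b/b", 900, True, 3),    # dropped: fewer than 10 references
    Website("c/c", 700, False, 40),  # dropped: not a software system
    Website("d/d", 800, True, 10),
]
print([w.repo for w in select_software_websites(sample)])
```

With the sample records above, only `d/d` and `a/a` pass both filters, and `d/d` is listed first because it has more stars.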

  • Impact by BIP!
    selected citations: 0
    popularity: Average
    influence: Average
    impulse: Average