
Reproducible supplements to the paper titled "Towards Long-term and Archivable Reproducibility", submitted to the Computing in Science and Engineering (CiSE) journal (currently available as a pre-print, arXiv:2006.03018). The paper proposes criteria to ensure long-term reproducibility and, as a proof of concept, introduces Maneage (Managing data lineage, hosted at maneage.org). This repository contains all the components (listed below) necessary to exactly reproduce the paper, which is itself written in Maneage. For more information on how to reproduce the paper, see the README.md and README-hacking.md files of maneaged-991f4c2.tar.gz.

The version string 991f4c2 (imprinted on some file names and hard-coded in others) represents the Git commit used for this publication. The full Git history (possibly including modifications after this publication) is also available at https://gitlab.com/makhlaghi/maneage-paper.

A brief description of the contents of this repository:

paper-991f4c2.pdf: The complete PDF of the paper, containing a full narrative of the project.

maneaged-991f4c2.tar.gz: A snapshot/checkout of the published pipeline's raw (plain text) source, which doesn't need Git to open, read or execute. This tarball is mainly tailored to producing the PDF (but has all the other components too), so it also contains all the project outputs that are used in the paper (figures and tables). It is the tarball that was uploaded to arXiv. Hence, it is not strictly the raw source, but it can nevertheless be used with minor modifications, as described in README.md.

project-git.bundle: A Git bundle of the project's history until commit 991f4c2. To get the full history, simply clone the bundle, giving the filename of the downloaded bundle on your computer in place of the URL of a repository. Archiving the full Git history here, beside the project's inputs and outputs, allows long-term preservation beyond personal Git repositories (which may be deleted unexpectedly, or whose host may become defunct).

software-991f4c2.tar.gz: Source-code tarballs of all the software used in this project. Preserving the software source with the data allows this project to be self-sufficient in case the software's hosting services become unavailable. All of it is free software. This version's tarball fixes the bug that was present in the previous version and mentioned there.

tools-per-year.txt: Output data file used in the demonstration plot of the paper (showing the number of papers studied by Menke et al. 2020 in every year and the number of papers mentioning software tools).

The Creative Commons copyright mentioned on the Zenodo webpage applies only to files that don't have an explicit copyright within them. The copyright of other files (mainly scripts and software) is stated within them (all are free licenses, primarily the GNU General Public License v3+). For any issues or questions on this project, please contact Mohammad Akhlaghi.
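
The instruction to clone project-git.bundle can be sketched as follows. This is a minimal illustration, not part of the published project: it relies only on the fact that `git clone` accepts a bundle file wherever a repository URL is expected. The function name `bundle_clone_command` and the destination directory `maneage-paper` are assumptions made for this example.

```python
from typing import List


def bundle_clone_command(bundle: str, destination: str) -> List[str]:
    """Build the `git clone` command that uses a bundle file as its source.

    `bundle` is the path of the downloaded bundle on your computer;
    `destination` is the directory to create (a hypothetical name here).
    """
    # The bundle's filename simply takes the place of a repository URL.
    return ["git", "clone", bundle, destination]


# Print the command rather than running it, so this sketch has no side effects.
print(" ".join(bundle_clone_command("project-git.bundle", "maneage-paper")))
```

To actually create the clone, the same list can be passed to `subprocess.run(..., check=True)`, assuming Git is installed and the bundle has been downloaded to the current directory.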
Scientific pipelines, Provenance, Data Lineage, Reproducibility, Workflows
