ZENODO
Project deliverable . 2024
License: CC BY
Data sources: Datacite

CLS INFRA D7.3 On Versioning Living and Programmable Corpora

(Executable) Report and Prototypes for Reproducible Research
Authors: Börner, Ingo; Trilcke, Peer

Abstract

Digital corpora, which are increasingly proving to be the most important epistemic objects of Computational Literary Studies (CLS), are by no means always static objects. On the contrary, it is becoming increasingly clear that the digitization of our cultural heritage needs to be understood as an ongoing process, which also implies that a number of the epistemic objects of CLS must be conceptualized as genuinely dynamic. We address this specific quality of some epistemic objects of CLS by speaking of “living corpora”. Where corpora, as the data of CLS, are also conceptually combined with code (e.g. in the form of an API) to form more complex research artifacts, we speak of “programmable corpora”, as described in detail in CLS INFRA Deliverable D7.1 “On Programmable Corpora”.

However, both living and programmable corpora usually pose a considerable problem with regard to the reproducibility of research. This report considers possible solutions for the stabilization of living and programmable corpora and thus shows ways of making them available for reproducing research in a sustainable and long-term manner.

By recommending Git commits as a way of versioning living corpora, we rely on a well-established and proven tool for distributed version control, which, as we show using a concrete example, can also be used for living corpora. This also offers the possibility of retrieving additional (technical and performative) metadata about corpora.

For the more complex programmable corpora, on the other hand, we recommend the containerization of the entire research infrastructure.

In a broader sense, this report is also an exploration of the traces left by a living corpus in the technical space of a Git-based version control system. The traces are recovered using a method that we call “algorithmic corpus archaeology”, a method which we recommend to all those who embark on the epistemological adventure of working with living and programmable corpora.
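To illustrate what versioning a living corpus by Git commit can amount to in practice, the following is a minimal Python sketch, not the tooling described in the report. It assumes a corpus kept in a Git repository that uses SHA-1 object names (40 hexadecimal characters). The class `CorpusVersion`, its method `checkout_commands`, the repository URL, and the commit hash are all hypothetical placeholders introduced only for this example.

```python
import re
from dataclasses import dataclass

# Assumption: the repository uses SHA-1 object names (40 lowercase hex digits).
_FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


@dataclass(frozen=True)
class CorpusVersion:
    """A living corpus pinned to one Git commit (hypothetical helper)."""

    repository: str  # clone URL of the corpus repository
    commit: str      # full commit hash identifying the snapshot

    def __post_init__(self):
        # Abbreviated hashes and branch names can become ambiguous or move
        # over time, so only a full commit hash is accepted as a version.
        if not _FULL_SHA1.fullmatch(self.commit):
            raise ValueError("expected a full 40-character commit hash")

    def checkout_commands(self, target="corpus"):
        """Return the Git commands that would restore this snapshot."""
        return [
            f"git clone {self.repository} {target}",
            f"git -C {target} checkout {self.commit}",
        ]


# Placeholder values; neither the URL nor the hash refers to a real corpus.
example = CorpusVersion(
    repository="https://example.org/some-corpus.git",
    commit="0123456789abcdef0123456789abcdef01234567",
)
for command in example.checkout_commands():
    print(command)
```

Citing the pair of repository and full commit hash, rather than a branch name, is what makes the reference stable: a branch keeps moving as the corpus grows, whereas a commit hash identifies one fixed state of the files.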

Keywords

Docker, Corpora, Computational Literary Studies, DraCor, Reproducible Research, Programmable Corpora
