
This dataset contains a computationally enriched edition of two foundational texts of Roman Stoicism: Epictetus’s Enchiridion and Marcus Aurelius’s Meditations. The data was produced using a novel "Deep Matching" framework that bridges the gap between raw Ancient Greek source material and technical English philosophical translation. The dataset is provided in a cleaned, machine-readable JSON format, suitable for immediate use in thematic lexical analysis, natural language processing (NLP), or educational application development.

Methodology

The data was generated through a four-stage pipeline:

Source Acquisition: Greek text (edition perseus-grc2) was harvested from the Scaife Viewer CTS API.

Contextual Translation: English translations and commentary were generated using the Gemini 3 Pro reasoning model, employing a 150-character sliding context window to maintain narrative continuity across chapter boundaries.

Lexical Lemmatization: Passages were enriched using the Classical Language Toolkit (CLTK) to map 45 core Stoic technical terms (e.g., prohairesis, logos, eph’ hēmin) back to their original Greek lemmas, regardless of grammatical inflection.

Refinement: Automated post-processing was applied to remove model artifacts and ensure structural integrity for production environments.

Contents

enchiridion_final_clean.json: The complete Enchiridion (53 chapters) with Greek text, English translation, Stoic notes, and thematic lexical tags.

meditations_final_clean.json: The complete Meditations (12 books) with Greek text, reflective English translation, commentary, and thematic lexical tags.

Processing Scripts: The suite of Python scripts used for fetching, translating, tagging, and cleaning the data.

Use Cases

Thematic Research: Quantitative analysis of technical Stoic vocabulary distribution.

Educational Tools: Development of interactive readers that highlight the relationship between original Greek and modern translation.

Citation

Moss, W. (2026). Data Paper: The Digital Stoic Library. Knowledge Commons. https://doi.org/10.5281/zenodo.18273320
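The 150-character sliding context window mentioned in the methodology can be illustrated with a minimal Python sketch. This is not the dataset's actual processing script: the function names `tail` and `with_sliding_context` are hypothetical, and the sketch assumes that the "context" is simply the last 150 characters of the text preceding each passage, carried across chapter boundaries.

```python
from typing import Iterator, List, Tuple

CONTEXT_CHARS = 150  # window size stated in the methodology


def tail(text: str, width: int) -> str:
    """Return the last `width` characters of `text` (empty if width <= 0)."""
    return text[-width:] if width > 0 else ""


def with_sliding_context(
    passages: List[str], width: int = CONTEXT_CHARS
) -> Iterator[Tuple[str, str]]:
    """Yield (context, passage) pairs.

    `context` is the trailing `width` characters of everything that came
    before `passage`, so a short chapter still inherits text from the
    chapter before it. Hypothetical helper, for illustration only.
    """
    previous = ""
    for passage in passages:
        yield tail(previous, width), passage
        # Append the passage, then keep only what the next window can use.
        joined = f"{previous} {passage}" if previous else passage
        previous = tail(joined, width)


if __name__ == "__main__":
    chapters = ["First chapter text.", "Second chapter text.", "Third."]
    for context, passage in with_sliding_context(chapters, width=12):
        print(repr(context), "->", repr(passage))
```

In a pipeline of this kind, each `(context, passage)` pair would be passed to the translation step so that the model sees the end of the preceding text alongside the passage being translated.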
Digital Humanities, Stoicism, Marcus Aurelius, Ancient Greek, Epictetus, NLP, Lemmatization
