ZENODO
Dataset . 2026
License: CC BY
Data sources: Datacite

Abbreviations in 13th-Century French Manuscripts: Statistical Analyses - Dataset

Authors: Stutzmann, Dominique; Mariotti, Viola; Ceresato, Floriana

Abstract

Dataset and code for: Dominique Stutzmann and Viola Mariotti, with the collaboration of Floriana Ceresato, «Les abréviations dans les manuscrits français du XIIIe siècle: analyses statistiques», in The Rise of Vernacular Writing. The Palaeographical Perspective. Proceedings of the 21st Colloquium of the Comité international de paléographie latine, Firenze (19-21 February 2020), ed. Irene Ceccherini and Teresa De Robertis, Turnhout, Brepols, 2026 (Bibliologia, 70).

This article examines thirteenth-century French abbreviation systems through statistical analysis of the ECMEN project corpus (IRHT, 2015-2019), comprising dated manuscripts from the BnF’s French collection (fr. 1-1000) with granular XML-TEI transcriptions. Our methodology separates graphic description from editorial interpretation, enabling systematic study of abbreviation practices across chronological, geographical, and generic contexts.

Two case studies reveal underlying coherence in scribal practices. Personal name abbreviations form remarkably unambiguous systems in which scribes systematically avoid confusion through specialized usage. The tilde -us (ꝰ), often considered polyvalent, proves largely univocal in practice, and its apparent ambiguities actually serve disambiguating functions, particularly in Picard scripta.

These findings demonstrate that medieval scribes developed sophisticated, internally coherent abbreviation systems adapted to vernacular linguistic realities. Rather than being sources of confusion, abbreviations functioned as tools for disambiguation, which suggests a need to revise assumptions about the relationship between Latin and vernacular abbreviation systems.

The paper was delivered in 2020; the final text of the article was submitted in April 2021. The data and code differ marginally from those used for the publication (minor typo corrections, and XSLT replaced by Python).
FOLDER STRUCTURE

/data/
├── /orig/       Original XML-TEI files
├── /TXM/        Tokenized files (used for the article)
└── /tokenized/  Re-tokenized files (demonstration)
/src/            XSLT transformation scripts
/out/            Output files (statistics, figures)

/data/orig/
Original XML-TEI transcription files from specific GitHub commits:
- Album_XIII.xml, ECMEN_ParisBnFMssFr.xml (ECMEN: https://github.com/oriflamms/ECMEN, commit 5ca5da9)
- CMDF_1.xml, CMDF_5.xml, CMDF_6.xml (CMDF: https://github.com/oriflamms/CMDF, commit f5b7dcb)

/data/TXM/
Tokenized XML-TEI files used for the article. Tokenization performed with TXM software using Oriflamms XSLT transformations (Lavrentiev & Stutzmann, https://github.com/oriflamms).

/data/tokenized/
Re-tokenized files for demonstration purposes only. This tokenization is less refined than the TXM version and illustrates the processing pipeline using the XSLT stylesheets in /src/.

/src/
XSLT 2.0 stylesheets for word tokenization:
- oriflamms-tokenize-words.xsl: splits text into and elements
- oriflamms-patch-words-with-lb.xsl: handles words split across line breaks

/out/
Output files: statistics (CSV) and figures (PNG) on abbreviation rates in 13th-century French manuscripts.
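
As an illustration of the kind of statistic stored in /out/, the following minimal Python sketch computes an abbreviation rate from a tokenized TEI fragment. It is not the dataset's own code. It assumes that each token is a `<w>` element in the TEI namespace and that an abbreviated word contains at least one `<abbr>`, `<am>`, or `<ex>` element; the markup actually used in /data/TXM/ may differ. The function name `abbreviation_rate` and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Assumption: a word counts as abbreviated if it contains any of these
# TEI elements. The dataset's actual encoding may use other markers.
ABBR_TAGS = ("abbr", "am", "ex")


def abbreviation_rate(root):
    """Return (abbreviated_words, total_words, rate) for a parsed TEI tree."""
    words = list(root.iter(f"{{{TEI_NS}}}w"))
    abbreviated = sum(
        1
        for w in words
        if any(w.find(f".//{{{TEI_NS}}}{tag}") is not None for tag in ABBR_TAGS)
    )
    rate = abbreviated / len(words) if words else 0.0
    return abbreviated, len(words), rate


# Invented sample for demonstration only; it is not taken from the corpus.
SAMPLE = f"""<text xmlns="{TEI_NS}"><body><p>
  <w>li</w> <w>rois</w>
  <w><choice><abbr>v<am>ꝰ</am></abbr><expan>v<ex>os</ex></expan></choice></w>
  <w>dist</w>
</p></body></text>"""

if __name__ == "__main__":
    print(abbreviation_rate(ET.fromstring(SAMPLE)))
```

To apply the same count to a file, one could pass `ET.parse(path).getroot()` to the function, where `path` points to one of the tokenized files, provided the file follows the assumed markup.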

Keywords

Linguistics/statistics & numerical data; Paleography; Literature, Medieval; History, Medieval
