
Corpus Litterarum is a line-based annotated dataset of Latin manuscript characters sampled from the Codices Sangallenses CSG 11 and CSG 70, provided by e-codices. Each line image has been annotated at the character level (73 classes) using Roboflow, with a semi-automatic workflow that combines manual annotation and model-assisted labelling. The dataset contains 2,152 line images and 44,407 annotations, distributed across predefined train/validation/test splits. Characters include standard Latin letters, abbreviations, and scribal signs, with suspensions left unresolved. The dataset supports research in palaeography, handwritten text recognition, and character segmentation.
