ZENODO
Dataset . 2026
License: CC BY
Data sources: ZENODO

HTR Winter School 2025 - Syriac, MS Jerusalem, Saint Mark's Monastery 36

Authors: Ishac, Ephrem Aboud; Roughan, Christine; Balski, Julius; Jamali, Nima; Kilicci, Jeanette; Lin, Chia-Wei; Macar, Andrei; +5 Authors

Abstract

Ground truth of 133 bifolio images of MS Jerusalem, Saint Mark's Monastery 36. This ground truth was produced by participants of the Vienna 2025 HTR Winter School, who used Transkribus to manually correct a preliminary automatic transcription that had been generated using a Kraken model (doi.org/10.5281/zenodo.17406773).

Description

  • Jerusalem, Saint Mark's Monastery, MS 36
  • Syriac, primarily Estrangelo but with Serto and Eastern features
  • Codex
  • Approximately 12th - 14th century
  • Scribe uncertain, perhaps the otherwise unknown Elias or Giwargis

Origin of the data

We are thankful to the St Mark's Syrian Orthodox Monastery - Jerusalem for providing us with the digital images of MS 36 and for allowing us to use and share these images to support research with Syriac handwritten text recognition. An online digitization of the manuscript may also be viewed in the virtual reading room of the Hill Museum & Manuscript Library at the shelfmark SMMJ 00036.

Segmentation and Transcription guidelines

The segmentation of the folios followed the SegmOnto vocabulary for annotation of regions:

  • MainZone: the main column of text.
  • MarginTextZone: any marginal words or phrases, including catchwords. Also used for interlinear glosses.
  • NumberingZone: any page or folio numbers.

The transcriptions include spaces, the Syriac letters, some diacritics, and punctuation, but no vowel dots or markings.

Allowed diacritics:

  • Syome
  • Dots over feminine suffix heh
  • Dots in pronouns: above for demonstrative, below for personal
  • Dots in verbs: to distinguish participles and perfects
  • Dots to distinguish homographs

Excluded diacritics:

  • Vowel dots
  • Dots of hardening and softening (qushoyo and rukokho)

Punctuation marks were not normalized, but rather transcribed as they appear in the manuscript (. ܆ ܇ : ܀). Transkribus's unclear tag was used when readings were uncertain or the text was damaged or unclear. There is additionally some use of the sic and variant tags in the corpus, but these were not applied consistently.

Data organisation

  • CITATION.cff
  • htr-united.yml
  • alto.zip: the ground truth in ALTO XML format, exported from eScriptorium
  • page.zip: the ground truth in PAGE XML format, exported from Transkribus
  • images.zip: the corresponding image files

Copyright and licence

This dataset was created as part of the Winter School of Handwritten Text Recognition of Medieval Manuscripts 2025, Vienna, at the Österreichische Akademie der Wissenschaften, Institut für Mittelalterforschung. All transcriptions are licensed under the Creative Commons 4 licence. Images were provided by the St Mark's Syrian Orthodox Monastery - Jerusalem and are licensed under the Creative Commons 4 licence.
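The rule on excluded diacritics can be checked mechanically on a transcribed line. The following is a minimal sketch, not part of the dataset's tooling. It assumes that any excluded marks would be encoded with the dedicated code points of the Unicode Syriac block (vowel points U+0730 to U+073F, qushshaya U+0741, rukkakha U+0742); the record does not state how marks are encoded. The function name find_excluded_marks is hypothetical, and the check deliberately says nothing about the allowed diacritics, whose encoding is also not stated.

```python
# Assumption: excluded marks, if present, use the dedicated code points of the
# Unicode Syriac block. The dataset description does not state this.
SYRIAC_VOWEL_POINTS = {chr(cp) for cp in range(0x0730, 0x0740)}  # U+0730..U+073F
QUSHSHAYA = "\u0741"  # dot of hardening (qushoyo)
RUKKAKHA = "\u0742"   # dot of softening (rukokho)
EXCLUDED_MARKS = SYRIAC_VOWEL_POINTS | {QUSHSHAYA, RUKKAKHA}


def find_excluded_marks(line: str) -> list[tuple[int, str]]:
    """Return (index, code point label) for every excluded mark in a line."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(line)
        if char in EXCLUDED_MARKS
    ]


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the manuscript.
    unmarked = "\u0712\u072A\u071D\u072B\u071D\u072C \u0700"
    marked = "\u0712\u0741\u072A"
    print(find_excluded_marks(unmarked))
    print(find_excluded_marks(marked))
```

An empty result means only that none of the assumed code points occur in the line. It does not show that the line follows the guidelines in every other respect.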
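To see how the three SegmOnto zones are distributed in a page, one could count the region types in a PAGE XML file from page.zip. The sketch below assumes that zone labels are stored the way Transkribus commonly writes structure types, in a custom attribute such as "structure {type:MainZone;}" on each TextRegion; this record does not confirm that layout. The function name count_region_types and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Union

# Assumption: the zone label sits in a `custom` attribute, for example
# custom="readingOrder {index:0;} structure {type:MainZone;}".
STRUCTURE_TYPE = re.compile(r"structure\s*\{[^}]*?type:([^;}]+)")


def count_region_types(page_xml: Union[str, bytes]) -> Counter:
    """Count TextRegion elements in one PAGE XML document by structure type."""
    root = ET.fromstring(page_xml)
    counts: Counter = Counter()
    for element in root.iter():
        # Compare local names so the schema version in the namespace is irrelevant.
        if element.tag.rsplit("}", 1)[-1] != "TextRegion":
            continue
        match = STRUCTURE_TYPE.search(element.get("custom", ""))
        counts[match.group(1).strip() if match else "(untyped)"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical, heavily reduced PAGE XML document for illustration only.
    sample = """
    <PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
      <Page imageFilename="hypothetical.jpg" imageWidth="100" imageHeight="100">
        <TextRegion id="r1" custom="readingOrder {index:0;} structure {type:MainZone;}"/>
        <TextRegion id="r2" custom="structure {type:MarginTextZone;}"/>
        <TextRegion id="r3"/>
      </Page>
    </PcGts>
    """
    print(count_region_types(sample))
```

Files read from the archive as bytes can be passed in directly, which also avoids problems with XML declarations that name an encoding.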

Keywords

Syriac, handwritten text recognition, Transkribus, HTR, ground truth, eScriptorium
