
This is the first automatic transcription of the entire collection of digital images of the Geniza at the National Library of Israel as of this date. It was created using kraken version 5.3.1.dev56. To find a fragment, enter its 99 ID number into KTIV.

We are aware that this is a very preliminary and imperfect result. We are releasing it now because of its high value for scholarship even in its current form.

We are aware of the following shortcomings. There are segmentation and text recognition mistakes. Some texts have the wrong reading order, with the left region preceding the right one. Vertical text has mostly been ignored. For many images with 3 or 4 parallel text regions, only the outer ones were captured. Arabic script recognition is less accurate than Hebrew script recognition.

The pipeline consisted of three steps:
a) An image classifier that chooses the best layout segmentation and recognition models. See https://edizionicafoscari.it//it/edizioni/riviste/magazen/2024/2/netlay-layout-classification-dataset-for-enhancing/#d670e63
b) Region and line segmentation with kraken.
c) Text recognition with kraken.

Funded by the European Union (ERC, MiDRASH, Project No. 101071829). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.
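Users who run into the reading-order problem mentioned above (the left region preceding the right one) may want to reorder regions themselves. The following is a minimal sketch of one possible approach, not the method used to produce this dataset. It assumes each region is available as an axis-aligned bounding box. The `Region` class, its field names, and the function `right_to_left_order` are hypothetical and not part of kraken.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical text region described by a bounding box in pixels."""
    name: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def overlaps_vertically(a: Region, b: Region) -> bool:
    """Return True if the two regions share some vertical extent."""
    return a.y_min < b.y_max and b.y_min < a.y_max


def right_to_left_order(regions: list[Region]) -> list[Region]:
    """Order regions top to bottom, and right to left within each row.

    Regions that overlap vertically with a member of the current row are
    treated as parallel columns of that row.
    """
    rows: list[list[Region]] = []
    for region in sorted(regions, key=lambda r: r.y_min):
        if rows and any(overlaps_vertically(region, other) for other in rows[-1]):
            rows[-1].append(region)
        else:
            rows.append([region])

    ordered: list[Region] = []
    for row in rows:
        # The rightmost region comes first for right-to-left scripts.
        ordered.extend(sorted(row, key=lambda r: r.x_max, reverse=True))
    return ordered


# Two parallel columns followed by a full-width region below them.
sample = [
    Region("left", 0, 0, 100, 200),
    Region("right", 120, 0, 220, 200),
    Region("footer", 0, 220, 220, 260),
]
reordered_names = [r.name for r in right_to_left_order(sample)]
```

This heuristic only handles simple column layouts. Marginalia, vertical text, and skewed images would need more careful treatment.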
fragmentology, kraken, Judeo-Arabic, Cairo Geniza, manuscripts, Aramaic, Digital Humanities, OCR, ATR, Computational Humanities, Hebrew, HTR, Jewish Studies, layout segmentation, image classification
