
Yarmouk Arabic OCR Dataset

Authors: Iyad Abu Doush; Faisal AlKhateeb; Anwaar Hamdi Gharibeh


Abstract

Optical Character Recognition (OCR) is the process of automatically recognizing characters in scanned documents or images. OCR software uses machine learning to recognize the characters in a document, so it must first go through a training phase in which it learns to recognize the letters in the text. This training phase requires a standard dataset, which can also be used to evaluate the results obtained. In this research, we propose a dataset for printed Arabic OCR. To the best of our knowledge, no Arabic OCR dataset is available to the research community that both includes ground truth and is large enough to build a robust Arabic OCR system. The proposed dataset is extracted from randomly selected Wikipedia articles so that it covers a variety of topics. It consists of 4,587 Arabic articles with a total of 8,994 images.
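
As an illustration of how ground truth might be used to evaluate OCR output, the sketch below computes a character error rate (CER): the edit distance between the recognized text and the ground-truth text, divided by the ground-truth length. The abstract does not name an evaluation metric, so the choice of CER is an assumption, and the function names `edit_distance` and `character_error_rate` are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / length of the ground-truth text (assumed metric)."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ground_truth = "كتاب"   # illustrative ground-truth word, not taken from the dataset
    ocr_output = "كتب"      # illustrative OCR output with one missing letter
    print(character_error_rate(ground_truth, ocr_output))
```

The distance here is counted per Unicode code point. Arabic text with diacritics may need normalization before comparison, which this sketch does not attempt.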
