
This dataset provides a standardized, ready-to-use collection of 5,578 cropped handwritten words extracted from physical medical prescriptions. It is designed to accelerate research and development in Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) systems within the healthcare domain.

Two versions of the dataset are provided:
Original Raw Data (RxHandBD Original.zip)
AI Compatible Data (RxHandBD.zip)

Dataset Structure & Characteristics

To facilitate immediate machine learning application, the dataset is pre-organized into standard Training and Testing splits (an 80/20 ratio). All images are standardized to a 128x128 pixel resolution to ensure uniform input dimensions for neural networks.

Total Images: 5,578 (.jpg format)
Vocabulary: 1,559 unique text entries (including generic medicines, pharmaceutical brands, dosage forms, and clinical instructions)
Training Set: 4,463 images (80% of the dataset), accompanied by train_labels.csv
Testing Set: 1,115 images (20% of the dataset), accompanied by test_labels.csv

Potential Use Cases

Digitizing handwritten prescriptions is a critical step in modernizing healthcare systems, reducing medication dispensing errors, and automating pharmacy workflows. By providing a clean, pre-split, and challenging benchmark of natural physician handwriting, this dataset enables researchers to directly train, validate, and compare deep learning architectures (such as CRNNs or Vision Transformers) for medical text extraction.
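As a quick sanity check before training, the split sizes stated above can be verified against the label files. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `summarize_labels` and `split_ratio` are hypothetical, and the layout of train_labels.csv and test_labels.csv is not documented here, so the label column name is left as a parameter that must be filled in after inspecting the CSV header.

```python
import csv


def summarize_labels(csv_path, label_column):
    """Return (row_count, unique_label_count) for one labels CSV.

    `label_column` is an assumption: the description does not state the
    CSV header, so pass whichever column holds the transcribed word.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        labels = [row[label_column] for row in reader]
    return len(labels), len(set(labels))


def split_ratio(n_train, n_test):
    """Return the (train, test) fractions of the combined set."""
    total = n_train + n_test
    return n_train / total, n_test / total


# Image counts taken from the dataset description.
train_frac, test_frac = split_ratio(4463, 1115)
print(f"train={train_frac:.4f} test={test_frac:.4f}")
```

Once the archive is extracted, calling `summarize_labels` on train_labels.csv and test_labels.csv with the correct column name should report 4,463 and 1,115 rows respectively if the files match the description. Note that the unique-label counts of the two files need not add up to 1,559, since the same word can appear in both splits.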
Handwritten Prescription, Text Recognition, Handwriting Recognition
