ZENODO
Dataset . 2022
License: CC BY
Data sources: Datacite

DECIMER Image classifier dataset

Authors: Agea, M. Isabel

Abstract

References:
  • Brinkhaus, H.O., Rajan, K., Zielesny, A. et al. RanDepict: Random chemical structure depiction generator. J Cheminform 14, 31 (2022). https://doi.org/10.1186/s13321-022-00609-4
  • Gaulton A, Hersey A, Nowotka M, Bento AP, Chambers J, Mendez D, Mutowo P, Atkinson F, Bellis LJ, Cibrián-Uhalte E, Davies M, Dedman N, Karlsson A, Magariños MP, Overington JP, Papadatos G, Smit I, Leach AR. (2017) 'The ChEMBL database in 2017.' Nucleic Acids Res., 45(D1) D945-D954
  • Lin, Tsung-Yi et al. (2014). Microsoft COCO: Common Objects in Context. https://arxiv.org/abs/1405.0312
  • B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. Learning Deep Features for Scene Recognition using Places Database. Advances in Neural Information Processing Systems 27 (NIPS), 2014.
  • Krishna, Ranjay et al. Visual Genome. Connecting Language and Vision Using Crowdsourced Dense Image Annotations. http://visualgenome.org/static/paper/Visual_Genome.pdf
  • https://storage.googleapis.com/openimages/web/index.html
  • T. Nasir, M. K. Malik and K. Shahzad, "MMU-OCR-21: Towards End-to-End Urdu Text Recognition Using Deep Learning," in IEEE Access, doi: 10.1109/ACCESS.2021.3110787
  • https://www.kaggle.com/datasets/vaibhao/handwritten-characters
  • https://www.kaggle.com/datasets/praveengovi/coronahack-chest-xraydataset
  • https://www.kaggle.com/datasets/amyjang/pandatilesagg?select=all_images
  • https://www.kaggle.com/datasets/nilay1987/bacterial-colony
  • https://www.kaggle.com/datasets/pabasar/ceylon-epigraphy-periods
  • https://www.kaggle.com/datasets/yuanhaowang486/chinese-calligraphy-styles-by-calligraphers
  • https://www.kaggle.com/datasets/sunedition/graphs-dataset
  • https://www.kaggle.com/datasets/kopfgeldjaeger/function-graphs-polynomial
  • https://www.kaggle.com/datasets/vishnunkumar/sketches
  • https://www.kaggle.com/datasets/almightyj/person-face-sketches
  • https://www.kaggle.com/datasets/olgabelitskaya/art-pictogram
  • https://www.kaggle.com/datasets/tatianasnwrt/russian-handwritten-letters
  • https://www.kaggle.com/datasets/olgabelitskaya/handwritten-russian-letters
  • https://www.kaggle.com/datasets/arashnic/misinfo-graph
  • https://www.kaggle.com/datasets/roycezjq/graphemeimgs224x224

Image dataset divided into train (10905114 images), validation (2115528 images) and test (544946 images) folders, each containing a balanced number of images for two classes: chemical structures and non-chemical structures. The chemical structure images were generated with RanDepict from randomly picked compounds in the ChEMBL30 database and the COCONUT database. The non-chemical images were either generated with Python or retrieved from several public datasets: COCO dataset, MIT Places-205 dataset, Visual Genome dataset, Google Open labeled Images, MMU-OCR-21 (kaggle), HandWritten_Character (kaggle), CoronaHack -Chest X-Ray-dataset (kaggle), PANDAS Augmented Images (kaggle), Bacterial_Colony (kaggle), Ceylon Epigraphy Periods (kaggle), Chinese Calligraphy Styles by Calligraphers (kaggle), Graphs Dataset (kaggle), Function_Graphs Polynomial (kaggle), sketches (kaggle), Person Face Sketches (kaggle), Art Pictograms (kaggle), Russian handwritten letters (kaggle), Handwritten Russian Letters (kaggle), Covid-19 Misinformation Tweets Labeled Dataset (kaggle) and grapheme-imgs-224x224 (kaggle). The data was used to build a CNN classification model by fine-tuning an EfficientNetB0 base model. The model is available on GitHub.
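
As a rough illustration of the split sizes quoted above, the following minimal Python sketch computes each split's share of the total and an approximate per-class count. The three image counts come from the description; the helper name split_summary and the exact 50/50 class balance are assumptions made only for this example, since the record says "balanced" without giving per-class figures.

```python
# Image counts per split, as stated in the dataset description.
SPLIT_COUNTS = {
    "train": 10_905_114,
    "validation": 2_115_528,
    "test": 544_946,
}


def split_summary(counts):
    """Return {split: (n_images, fraction_of_total, images_per_class)}.

    images_per_class assumes an exact 50/50 balance between the two
    classes. That is an assumption for illustration, not a figure
    taken from the dataset record.
    """
    total = sum(counts.values())
    return {name: (n, n / total, n // 2) for name, n in counts.items()}


if __name__ == "__main__":
    for name, (n, frac, per_class) in split_summary(SPLIT_COUNTS).items():
        print(f"{name:<10} {n:>10,d}  {frac:6.1%}  ~{per_class:,d} per class")
```

Under these assumptions the training folder holds roughly four fifths of all images, with the remainder going mostly to validation and a small share to test.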

Keywords

chemical structures, classification model

  • BIP! impact indicators
    selected citations: 1
    popularity: Average
    influence: Average
    impulse: Average
  • OpenAIRE UsageCounts
    views: 73
    downloads: 17