
Classification of data is an important step in the evolution of scientific knowledge. Traditionally, classification in the sciences has been performed by human experts, who can recognize the unique functional properties that are necessary and sufficient to place complex structures and phenomena into a particular class or group. However, with the growth of scientific data and rapid changes in knowledge, it is no longer feasible for humans to classify every object by hand. Automating the classification process is necessary to cope with the growing amount of data; otherwise, classification will become the rate-limiting step in scientific data analysis. In this paper, we address the need for such automation in the SciAEther project and develop ChES, a fast and reproducible framework for classifying molecules in chemical data. Our framework captures human understanding through an ontology, and the diversity of classification types through a rule-based system, in order to classify complex molecular compounds. We have tested our system on molecules from the PubChem repository and found that our knowledge-based, automatic classification matches, and sometimes surpasses, that of human experts.
