
pmid: 35595245
AbstractMotivationAutomated molecule generation is a crucial step in in-silico drug discovery. Graph-based generation algorithms have seen significant progress over recent years. However, they are often complex to implement, hard to train and can under-perform when generating long-sequence molecules. The development of a simple and powerful alternative can help improve practicality of automated drug discovery method.ResultsWe proposed a ConvNet-based sequential graph generation algorithm. The molecular graph generation problem is reformulated as a sequence of simple classification tasks. At each step, a convolutional neural network operates on a sub-graph that is generated at previous step, and predicts/classifies an atom/bond adding action to populate the input sub-graph. The proposed model is pretrained by learning to sequentially reconstruct existing molecules. The pretrained model is abbreviated as SEEM (structural encoder for engineering molecules). It is then fine-tuned with reinforcement learning to generate molecules with improved properties. The fine-tuned model is named SEED (structural encoder for engineering drug-like-molecules). The proposed models have demonstrated competitive performance comparing to 16 state-of-the-art baselines on three benchmark datasets.Availability and implementationCode is available at https://github.com/yuh8/SEEM and https://github.com/yuh8/SEED. QM9 dataset is availble at http://quantum-machine.org/datasets/, ZINC250k dataset is availble at https://raw.githubusercontent.com/aspuru-guzik-group/chemical_vae/master/models/zinc_properties/250k_rndm_zinc_drugs_clean_3.csv, and ChEMBL dataset is availble at https://www.ebi.ac.uk/chembl/.Supplementary informationSupplementary data are available at Bioinformatics online.
Drug Discovery, Neural Networks, Computer, Algorithms
Drug Discovery, Neural Networks, Computer, Algorithms
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 2 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
