Publication · Article · Preprint · Other literature type · 2017

ChemTS: an efficient Python library for de novo molecular generation

Yang, Xiufeng; Zhang, Jinzhe; Yoshizoe, Kazuki; Terayama, Kei; Tsuda, Koji
Open Access English
  • Published: 24 Nov 2017
  • Journal: Science and Technology of Advanced Materials (ISSN: 1468-6996, eISSN: 1878-5514)
  • Publisher: Taylor & Francis Group
Abstract
Automatic design of organic materials requires black-box optimization in a vast chemical space. In conventional molecular design algorithms, a molecule is built as a combination of predetermined fragments. Recently, deep neural network models such as variational autoencoders and recurrent neural networks (RNNs) have been shown to be effective in de novo design of molecules without any predetermined fragments. This paper presents a novel Python library, ChemTS, that explores the chemical space by combining Monte Carlo tree search and an RNN. In a benchmarking problem of optimizing the octanol-water partition coefficient and synthesizability, our algorithm showed...
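The abstract names the technique but not its details, so the following is only a minimal sketch of how a Monte Carlo tree search over symbol strings can be organized. It is not the ChemTS code. Everything in it is an assumption made for illustration: the four-symbol alphabet with a hypothetical terminal token `$`, the uniform sampler `toy_policy` standing in for the trained RNN, the score `toy_reward` standing in for the partition-coefficient and synthesizability objective, and the standard UCB1 rule for child selection.

```python
import math
import random

# Toy alphabet standing in for SMILES symbols; "$" is a hypothetical end token.
ALPHABET = ["C", "O", "N", "$"]
MAX_LEN = 6  # hypothetical cap on string length


class Node:
    """One search-tree node, identified by the string prefix built so far."""

    def __init__(self, prefix, parent=None):
        self.prefix = prefix
        self.parent = parent
        self.children = {}
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return self.prefix.endswith("$") or len(self.prefix) >= MAX_LEN


def ucb1(child, parent_visits, c=1.0):
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(2 * math.log(parent_visits) / child.visits)


def toy_policy(prefix, rng):
    """Stand-in for the RNN: samples the next symbol uniformly at random."""
    return rng.choice(ALPHABET)


def toy_reward(string):
    """Stand-in for the property score: share of 'C' symbols, in [0, 1]."""
    return string.rstrip("$").count("C") / MAX_LEN


def rollout(prefix, rng):
    """Complete a partial string by sampling symbols until it terminates."""
    while not (prefix.endswith("$") or len(prefix) >= MAX_LEN):
        prefix += toy_policy(prefix, rng)
    return prefix


def search(iterations, seed=0):
    rng = random.Random(seed)
    root = Node("")
    best = ("", -math.inf)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes by UCB1.
        while not node.is_terminal() and len(node.children) == len(ALPHABET):
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch, node.visits))
        # Expansion: add one untried child symbol.
        if not node.is_terminal():
            untried = [s for s in ALPHABET if s not in node.children]
            symbol = rng.choice(untried)
            child = Node(node.prefix + symbol, parent=node)
            node.children[symbol] = child
            node = child
        # Simulation: complete the string with the policy and score it.
        completed = rollout(node.prefix, rng)
        reward = toy_reward(completed)
        if reward > best[1]:
            best = (completed, reward)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best, root


if __name__ == "__main__":
    (string, score), _ = search(500)
    print(string, score)
```

In a real setting the rollout step would draw symbols from a trained sequence model, and the reward would be computed from the completed molecule, with invalid strings handled separately. Both are outside what this sketch attempts.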
Subjects
free text keywords: Molecular design, Monte Carlo tree search, recurrent neural network, python library, Materials of engineering and construction. Mechanics of materials, TA401-492, Biotechnology, TP248.13-248.65, Physics - Chemical Physics, Computer Science - Computational Engineering, Finance, and Science, Article, New topics/Others, 60 New topics/Others, 404 Materials informatics / Genomics, General Materials Science