Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2024
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2024
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2024
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2024
License: CC BY
Data sources: ZENODO
image/svg+xml Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao Closed Access logo, derived from PLoS Open Access logo. This version with transparent background. http://commons.wikimedia.org/wiki/File:Closed_Access_logo_transparent.svg Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao
Lunaris
Dataset . 2024
License: CC BY
Data sources: Lunaris
ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite
versions View all 6 versions
addClaim

This Research product is the result of merged Research products in OpenAIRE.

You have already added 0 works in your ORCID record related to the merged Research product.

Calculated state-of-the art results for solvation and ionization energies of thousands of organic molecules relevant to battery design

Authors: Weinreich, Jan; Karandashev, Konstantin; Arismendi Arrieta, Daniel Jose; Hermansson, Kersti; von Lilienfeld, Anatole;

Calculated state-of-the art results for solvation and ionization energies of thousands of organic molecules relevant to battery design

Abstract

This dataset presents molecular properties critical for battery electrolyte design, specifically solvation energies, ionization potentials, and electron affinities. The dataset is intended for use in machine learning model testing and algorithm validation. The properties calculated include solvation energies using the COSMO-RS method [1] and ionization potentials and electron affinities using various high-accuracy computational methods as implemented in MOLPRO [2]. Computational details can be found in Ref. [3], with scripts used to generate the data mostly uploaded to our github repository [4]. Molecular Datasets Considered: QM9 Dataset: Contains small organic molecules broadly relevant for quantum chemistry [5] Electrolyte Genome Project (EGP): Focuses on materials relevant to electrolytes.[6] GDB17 and ZINC databases: Offer a broad chemical diversity with potential application in battery technologies. [7, 8] Data structure How to Load the Data: All files can be loaded with import json with open("file.json", "r") as f: data_dict = json.load(f) and the filestructure can be explored with data_dict.keys() Solvation energies The data is stored in two types of JSON archives: files for full molecules of GDB17 and ZINC and files for amons of GDB17 and ZINC. They are structured differently as amon entries are sorted by the number of heavy atoms in the amon (e.g., all amons with 3 heavy atoms are stored in ni3). Because of the large number of amons with 6 or 7 heavy atoms,they are further split into ni6_1, ni6_2, and so on. A sub dictionary of an amon dictionary or a full molecule dictionary contains the following keys: ECFP - ECFP4 representation vector SMILES - SMILES string SYMBOLS - atomic symbols COORDS - atomic positions in Angstrom ATOMIZATION - atomization energy in [kcal/mol] DIPOLE - dipole moment in Debye ENERGY - energy in Hartree SOLVATION - solvation energy in [kcal/mol] for different solvents at 300 K. Files: GDB17.json.zip (unpack with unzip first!) - subset of GDB17 random molecules AMONS_ZINC.json - all amons of ZINC up to 7 heavy atoms EGP.json - EGP molecules AMONS_GDB17.json - all amons of GDB17 up to 7 heavy atoms File Name Description Molecules all_amons_gdb17.json GDB17 amons 40726 all_amons_zinc.json ZINC amons 91876 GDB17.json Subset of GDB17 312793 EGP.json EGP molecules 15569 Atomic energies $E_{at}$ at BP and def2-TZVPD level in Hartree [Ha] Element H C N O F Br Cl S P $E_{at}$ [Ha] -0.5 -37.85 -54.60 -75.09 -99.77 -2574.40 -460.20 -398.16 -341.30| B Si -24.65 -289.40 We follow the convention of negative atomization energies for stablity compared to the isolated atoms: $E_{atomization} = E_{mol} - \sum_{i} E_{at,i}$ Free energy of solvation at 300 K in [kcal/mol]: Ionization potentials and electron affinities The upload contains two JSON files, QM9IPEA.json and QM9IPEA_atom_ens.json. QM9IPEA.json summarizes MOLPRO calculation data grouping it along the following dictionary keys: COORDS - atom coordinates in Angstroms. SYMBOLS - atom element symbols. ENERGY - total energies for each charge (0, -1, 1) and method considered. CPU_TIME - CPU times (in seconds) spent at each step of each part of the calculation. DISK_USAGE - highest total disk usage in GB. ATOMIZATION_ENERGY - atomization energy at charge 0. QM9_ID - ID of the molecule in the QM9 dataset. All energies are given in Hartrees with NaN indicating the calculation failed to converge. Ionization potentials and electron affinities can be recovered as energy differences between neutral and charged (+1 for ionization potentials, -1 for electron affinities) species. "CPU_time" entries contain steps corresponding to individual method calculations, as well as steps corresponding to program operation: "INT" (calculating integrals over basis functions relevant for the calculation), "FILE" (dumping intermediate data to restart file), and "RESTART" (importing restart data). The latter two steps appeared since we reused relevant integrals calculated for neutral species in charged species' calculations; we also used restart functionality to use HF density matrix obtained for the neutral species as the initial density matrix guess for the SCF-HF calculation for charged species. NaN CPU time value means the step was not present or that the calculation is invalid. Note that the CPU times were measured while parallelizing on 12 cores and were not adjusted to single-core. QM9IPEA_atom_ens.json contains atomic energies used to calculate atomization energies in QM9IPEA.json, the dictionary keys are: SPINS - the spin assigned to elements during calculations of atomic energies. ENERGY - energies of atoms using different methods. (Note that H has only one electron and thus does not require a level of theory beyond Hartree-Fock.) NOTE: Additional calculations were performed between publication of arXiv:2308.11196 and creation of this upload. For the version of the dataset used in the manuscript, please refer to DOI:10.5281/zenodo.8252498. Acknowledgement This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 957189 (BIG-MAP) and No. 957213 (BATTERY 2030+). O.A.v.L. has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 772834). O.A.v.L. has received support as the Ed Clark Chair of Advanced Materials and as a Canada CIFAR AI Chair. O.A.v.L. acknowledges that this research is part of the University of Toronto’s Acceleration Consortium, which receives funding from the Canada First Research Excellence Fund (CFREF). Obtaining the presented computational results has been facilitated using the queueing system implemented at https://leruli.com. The project has been supported by the Swedish Research Council (Vetenskapsrådet), and the Swedish National Strategic e-Science program eSSENCE as well as by computing resources from the Swedish National Infrastructure for Computing (SNIC/NAISS). References [1] Klamt, A.; Eckert, F. COSMO-RS: a novel and efficient method for the a priori prediction of thermophysical data of liquids. Fluid Phase Equilibria 2000, 172, 43–72 [2] Werner, H.-J.; Knowles, P. J.; Knizia, G.; Manby, F. R.; Schutz, M. Molpro: a general-purpose quantum chemistry program package. WIREs Comput. Mol. Sci. 2012, 2, 242–253 [3] arxiv link of draft [4] https://github.com/chemspacelab/ViennaUppDa [5] Ramakrishnan, R.; Dral, P. O.; Rupp, M.; von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 2014, 1, 140022 [6] Qu, X.; Jain, A.; Rajput, N. N.; Cheng, L.; Zhang, Y.; Ong, S. P.; Brafman, M.; Mag- inn, E.; Curtiss, L. A.; Persson, K. A. The Electrolyte Genome Project: A big data approach in battery materials discovery. Comput. Mater. Sci. 2015, 103, 56–67 [7] Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J.-L. Enu- meration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17. Journal of Chemical Information and Modeling 2012, 52, 2864–2875 [8] Irwin, J. J.; Shoichet, B. K. ZINC A Free Database of Commercially Available Compounds for Virtual Screening. Journal of Chemical Information and Modeling 2005, 45, 177–182.

Country
Canada
  • BIP!
    Impact byBIP!
    citations
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
citations
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
0
Average
Average
Average
Related to Research communities