
The rMD17-aq dataset:

Citation: Jas Kalayan, Ismaeel Ramzan, Christopher D. Williams, Neil A. Burton and Richard A. Bryce, "A neural network potential based on pairwise resolved atomic forces and energies", publication TBC

Description: QM/MM aqueous simulations of the 10 molecules from the original MD17 dataset by Chmiela et al. (and the revised dataset by Christensen et al.) were performed, with each solute surrounded by 400 SPC/E water molecules. Each simulation was run for 100 ps at a temperature of 500 K and a pressure of 1 atm. The solute conformations sampled from the QM/MM simulations performed with CP2K were then used to recalculate the forces and energies of each conformation in Gaussian with a denser integral grid, to effectively remove numerical noise. We also include an 11th entry, a higher energy conformer of salicylic acid (directory name: salicylic_high_energy_conformer), in addition to the lower energy conformer sampled in the MD17 dataset. For each molecule (excluding all surrounding water molecules), this dataset contains the nuclear charges, coordinates (Angstrom), forces (kcal/mol/Angstrom), energies (kcal/mol) and partial atomic charges (atomic units) in space-separated format, as output by the numpy savetxt function.

The data:

The files in each molecule directory are:
'nuclear_charges.txt' : The nuclear charges for each atom in a molecule
'coords.txt' : The Cartesian coordinates for each atom in a conformation (Angstrom units)
'energies.txt' : The total energy of each conformation (kcal/mol units)
'forces.txt' : The Cartesian forces for each atom in a conformation (kcal/mol/Angstrom units)
'charges.txt' : The partial ElectroStatic Potential (ESP) atomic charges (atomic units)
'molecules.prmtop' : The Amber formatted topology file containing the MM parameters for water molecules (solute MM parameters are not used)
'minimised.rst.pdb' : The initial coordinates of a minimised system used to perform QM/MM simulations in CP2K

The input data:

The input files to perform simulations and single point energy calculations are provided in the '_cp2k_gaussian_example_inputs' directory. These files are:
'cp2k-qmmm-example.inp' : Input file for the QM/MM simulations performed with CP2K. The numbers of QM atoms of each kind are replaced with the placeholders CCC, OOO, HHH and NNN, which stand for the number of carbon, oxygen, hydrogen and nitrogen atoms, respectively, in a solute molecule. The system dimensions placeholder XXYYZZ can be replaced with the BOX_DIMENSIONS in the molecules.prmtop file.
'def2-svp.1.cp2k' : The basis set used in QM/MM simulations
'gaussain_input.com' : An example of a Gaussian input file for single point energy calculations for aspirin
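As a rough sketch of how the per-molecule text files listed above might be read back with numpy, the following hypothetical helper (load_molecule is not part of the dataset) loads one molecule directory. It assumes that nuclear_charges.txt holds one value per atom, and that the values in coords.txt, forces.txt and charges.txt are ordered conformation by conformation and atom by atom. The exact row layout of the files is not stated in this description, so the reshapes should be checked against the actual data.

```python
from pathlib import Path

import numpy as np


def load_molecule(directory):
    """Load the savetxt-formatted arrays for one molecule directory.

    Assumption (not confirmed by the dataset description): values are stored
    conformation by conformation, atom by atom, then x, y, z, so a row-major
    reshape recovers arrays of shape (n_conformations, n_atoms, 3).
    """
    directory = Path(directory)

    # One nuclear charge per atom; reshape(-1) also handles a single-atom file.
    nuclear_charges = np.loadtxt(directory / "nuclear_charges.txt").astype(int).reshape(-1)
    n_atoms = nuclear_charges.size

    coords = np.loadtxt(directory / "coords.txt").reshape(-1, n_atoms, 3)    # Angstrom
    forces = np.loadtxt(directory / "forces.txt").reshape(-1, n_atoms, 3)    # kcal/mol/Angstrom
    energies = np.loadtxt(directory / "energies.txt").reshape(-1)            # kcal/mol
    charges = np.loadtxt(directory / "charges.txt").reshape(-1, n_atoms)     # atomic units

    # Every file should describe the same number of conformations.
    if not (len(coords) == len(forces) == len(energies) == len(charges)):
        raise ValueError("inconsistent number of conformations across files")

    return {
        "nuclear_charges": nuclear_charges,
        "coords": coords,
        "forces": forces,
        "energies": energies,
        "charges": charges,
    }
```

For example, load_molecule("salicylic_high_energy_conformer") would return a dictionary of arrays for the additional salicylic acid conformer, provided the layout assumption above holds.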
small molecules, machine learning, forces, energies, chemistry
