Force fields are used in a wide variety of contexts for classical molecular simulation, including studies on protein-ligand binding, membrane permeation, and thermophysical property prediction. The quality of these studies relies on the quality of the force fields used to represent the systems. Focusing on small molecules of fewer than 50 heavy atoms, this dataset compares nine force fields: GAFF, GAFF2, MMFF94, MMFF94S, OPLS3e, SMIRNOFF99Frosst, and the Open Force Field Parsley, versions 1.0, 1.1, and 1.2. On a dataset comprising 22,675 molecular structures of 3,271 molecules, we analyzed force field-optimized geometries and conformer energies compared to reference quantum mechanical (QM) data. The data was created using scripts from the benchmarkff GitHub repository.

A corresponding manuscript has been submitted, and a preprint is available on ChemRxiv: Lim, Victoria T.; Hahn, David F.; Tresadern, Gary; Bayly, Christopher I.; Mobley, David (2020): Benchmark Assessment of Molecular Geometries and Energies from Small Molecule Force Fields. ChemRxiv. Preprint.

See below or the file README.md for further information and a description of the content:

# README

Version: 04 Nov 2020

For Python scripts that are NOT found in these directories, please check the [BenchmarkFF Github repo](https://github.com/MobleyLab/benchmarkff/tree/master/tools).

## Procedure

1. Prep OPLS3e file for analysis: standardize the format with OpenEye in case of differences, and convert energies from kJ/mol to kcal/mol.

    ```
    cd prep
    python convert_extension.py -i opls3e_minimized.sd -o opls3e.sdf
    ```

2. Remove molecules that could not be parameterized by ALL force fields.

    ```
    python get_by_tag.py -i opls3e.sdf -s "SMILES QCArchive" -list trim3.txt -o trim3_full_opls3e.sdf
    ```

3. Run analysis.
    ```
    conda activate parsley

    # calc ddE, RMSD, and TFD distributions
    python compare_ffs.py -i match.in -t 'SMILES QCArchive' --plot > metrics.out

    # match_minima, only in 01_analysis_all and 02_analysis_all_smaller_cutoff
    python match_minima.py -i match.in --plot --cutoff 1.0 --readpickle

    # look at specific subsets, only in 01_analysis_all
    python color_by_moiety.py -i match.in -p metrics.pickle -s N-N.dat azetidine.dat octahydrotetracene.dat -o scatter_tfd_3_

    # look at outliers, only in 01_analysis_all and 02_analysis_all_smaller_cutoff
    python tailed_parameters.py -i refdata_trim_overlap_full_openff_unconstrained-1.2.0.sdf -f <offxml file> --metric 'TFD' --cutoff 0.12 --tag "TFD to trim_overlap_full_qcarchive.sdf" --tag_smiles "SMILES QCArchive" > output_tfd.dat
    ```

## Brief description of contents

* High level:

```
.
├── 00_prep
│   ├── convert_extension.py
│   ├── opls3e_minimized.sd         OPLS3e minimized structures from Schrodinger Maestro
│   ├── opls3e.sdf                  standardized through OpenEye tools
│   └── opt_openff*.sdf             OpenFF minimized conformations
├── 01_analysis_all                 compare all ffs (qm, GAFF(2), MMFF94(S), Smirnoff, OpenFF-X.X, OPLS3e)
├── 02_analysis_all_smaller_cutoff  compare all ffs (qm, GAFF(2), MMFF94(S), Smirnoff, OpenFF-X.X, OPLS3e) with a smaller cutoff of 0.3 for match_minima
├── 03_analysis_latest_ffs          compare only the latest versions of ffs (qm, GAFF2, MMFF94S, OpenFF-1.2, OPLS3e)
├── 04_analysis_openff_only         compare only OpenFF ffs (qm, Smirnoff, OpenFF-X.X)
└── README.md
```

* Inside an output directory:

```
YY_analysis_*               various output files of the above-mentioned scripts; some are listed and described below:
├── bar*.png                parameter coverage bar plots
├── ddE.dat                 relative energies data
├── fig_density_*.png       scatter plots of ddE vs (RMSD or TFD) for each force field
├── match.in                input file for compare_ffs.py
├── metrics.out             output file for compare_ffs.py
├── metrics.pickle          pickle file for compare_ffs.py -- you can read this into compare_ffs instead of rerunning the full analysis
├── refdata_*.sdf           output SDF files with stored RMSD / TFD scores with reference to QM for each structure
├── relene_*.dat            relative energies of matched conformers
├── ridge_dde.png           compared energies plot
├── ridge_rmsd.svg          compared rmsds plot
├── ridge_tfd.svg           compared tfds plot
├── fig_scatter_*.png       scatter plots of ddE vs (RMSD or TFD). these are noisy; I don't use these
├── trim3_*.sdf             input SDF files for compare_ffs.py listed in match.in file
├── violin*.*               violin plot showing ddE distributions
```
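
Steps 1 and 2 of the procedure rest on two simple ideas: a unit conversion from kJ/mol to kcal/mol, and keeping only the molecules that every force field could parameterize. The following is a minimal plain-Python sketch of those two ideas, not the implementation used by `convert_extension.py` or `get_by_tag.py`. The function names `kj_to_kcal` and `common_molecules`, and the example identifiers, are hypothetical and chosen for illustration only.

```python
from __future__ import annotations

# 1 kcal = 4.184 kJ (thermochemical calorie, exact by definition)
KJ_PER_KCAL = 4.184


def kj_to_kcal(energy_kj_per_mol: float) -> float:
    """Convert an energy from kJ/mol to kcal/mol."""
    return energy_kj_per_mol / KJ_PER_KCAL


def common_molecules(per_ff_ids: dict[str, set[str]]) -> set[str]:
    """Return the identifiers that appear for every force field.

    `per_ff_ids` maps a force field label to the set of molecule
    identifiers (hypothetical here, e.g. SMILES strings) that the
    force field could parameterize.
    """
    if not per_ff_ids:
        return set()
    return set.intersection(*per_ff_ids.values())


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    parameterized = {
        "gaff2": {"CCO", "CCN", "c1ccccc1"},
        "opls3e": {"CCO", "c1ccccc1"},
        "openff-1.2.0": {"CCO", "CCN", "c1ccccc1"},
    }
    print(sorted(common_molecules(parameterized)))
    print(round(kj_to_kcal(41.84), 6))
```

Taking the intersection across force fields means every comparison downstream is made on the same set of molecules, so differences between force fields are not confounded by differences in coverage.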
benchmark, force field, conformer energy, molecular energy, molecular geometry, rmsd, tfd, azetidine, octahydrotetracene, nitrogen-nitrogen bond, gaff, gaff2, mmff94, mmff94s, parsley, open force field, OPLS3e, openff, SMIRNOFF99Frosst