Downloads provided by UsageCounts
Dataset compiled and curated for use in the ThermoMPNN paper: https://doi.org/10.1073/pnas.2314853121

Dataset for training models for prediction of thermodynamic stability changes (ddG) of protein point mutations given a wildtype protein structure (PDB) file. Data was assembled by matching sequence-based ddG measurements in FireProtDB to structures from the RCSB Protein Data Bank (PDB). For details, see the Methods section of our manuscript.

Citing this work:
If you choose to use this dataset for your own research, please cite this repository and the ThermoMPNN paper: https://doi.org/10.1073/pnas.2314853121.

Contents:
- pdbs/ directory contains all PDB files
- csvs/ directory contains all CSVs with mutation data
- csvs/4_fireprotDB_bestpH.csv is the main (full) dataset file with 3,438 mutations across 100 proteins.
- csvs/fireprot_splits.pkl contains the dataset splits (train/val/test) used in our study
- csvs/splits/ contains CSVs for each of the splits (train/val/test/homologue-free) indexed from the full dataset CSV.

Important CSV columns:
- pdb_id_corrected: corresponds to the PDB in the pdbs/ directory (after curation and disambiguation)
- ddG: ddG value for mutation (mutant - WT)
- wild_type: wild-type amino acid (1-letter code)
- mutation: mutant amino acid (1-letter code)
- pdb_position: 0-based index of the mutated residue in the PDB file (may be different from position in the original FireProtDB sequence entry)
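The column descriptions above can be illustrated with a minimal sketch of reading the mutation data using only the Python standard library. The helper names (load_mutations, mutation_label, group_by_protein) are hypothetical and not part of the dataset or the ThermoMPNN code, and the sample rows are invented placeholders, not real entries. The sketch assumes the columns parse as plain numbers and shows how the 0-based pdb_position might be shifted to a conventional 1-based mutation label.

```python
import csv
import io
from collections import defaultdict


def load_mutations(handle):
    """Parse rows from a CSV that has the columns listed in the description."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "pdb_id": row["pdb_id_corrected"],
            "wild_type": row["wild_type"],
            "mutation": row["mutation"],
            # pdb_position is described as 0-based; int(float(...)) is an
            # assumption that tolerates values stored as "4" or "4.0".
            "position": int(float(row["pdb_position"])),
            "ddG": float(row["ddG"]),  # mutant - WT, per the description
        })
    return records


def mutation_label(record):
    """Build a label such as A1G, shifting the 0-based index to 1-based."""
    return f"{record['wild_type']}{record['position'] + 1}{record['mutation']}"


def group_by_protein(records):
    """Collect records per pdb_id_corrected value."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["pdb_id"]].append(record)
    return dict(grouped)


# Hypothetical rows for illustration only; NOT taken from the dataset.
SAMPLE = """pdb_id_corrected,wild_type,mutation,pdb_position,ddG
XXXX,A,G,0,-1.5
XXXX,L,V,9,0.3
YYYY,K,E,4,2.0
"""

records = load_mutations(io.StringIO(SAMPLE))
labels = [mutation_label(r) for r in records]
by_protein = group_by_protein(records)
```

To try this on the real file, the in-memory sample could be replaced with open("csvs/4_fireprotDB_bestpH.csv", newline=""), assuming the archive has been extracted into the working directory and the file contains no missing values in these columns.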
Protein Design, Machine Learning, Protein Stability, Structural Biology, Protein Thermodynamic Stability
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 113 |
| downloads | | 19 |

Views provided by UsageCounts