
MD Simulation Datasets for GPCRs

This repository contains Molecular Dynamics (MD) simulation data for four G-Protein Coupled Receptors (GPCRs):

- Dopamine D2 Receptor (D2R)
- Dopamine D1 Receptor (D1R)
- Adenosine A2A Receptor (A2AR)
- Beta-1 Adrenergic Receptor (B1AR)

All simulations were prepared and processed for use with the machine learning model described in the paper "Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings." Each directory contains the files needed to reproduce or analyze the simulation trajectories.

Repository Structure

The archive is organized into two main directories:

1. MD_simulation_data

   Contains GROMACS simulation files for each individual run. Within this directory, each replica (e.g., run1/ and run6/) has two main subdirectories, plus a toppar/ directory where applicable:

   - input_files/: All starting structures (PDB format), topology files (.top), index files (.ndx), and MD parameter files (.mdp) covering every simulation stage
   - production_run/: Outputs from the production phase, including the GROMACS portable run input file (.tpr) and the processed trajectory file (.xtc)
   - toppar/: Custom force field parameters (if applicable)

2. Processed_ML_Input_JSON

   Houses the principal JSON file used for machine learning input (e.g., final_combined.json or my_protein.json). This file aggregates aligned heavy-atom coordinates and dihedral angles for 12,241 frames sampled from a representative D2R trajectory. It served as the training data for the LD-FPG generative model.

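
The internal schema of the JSON file is not documented in this description, so it may help to inspect its top-level layout before feeding it to any pipeline. The following is a minimal sketch in Python using only the standard library. The function name summarize_json and the example path are illustrative assumptions, not part of the archive or of the LD-FPG code.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return a {key: description} summary of a JSON file's top level.

    Makes no assumption about the schema: values are described by type
    and length only, so it can be run on an unfamiliar file.
    """
    with Path(path).open("r", encoding="utf-8") as handle:
        data = json.load(handle)

    def describe(value):
        # Containers and strings report their size; scalars only their type.
        if isinstance(value, (list, dict, str)):
            return f"{type(value).__name__} (len={len(value)})"
        return type(value).__name__

    if isinstance(data, dict):
        return {key: describe(value) for key, value in data.items()}
    # A top-level list or scalar has no keys; describe it as a whole.
    return {"<root>": describe(data)}


if __name__ == "__main__":
    # Hypothetical path; adjust to the actual file name in the archive.
    target = Path("Processed_ML_Input_JSON") / "final_combined.json"
    if target.exists():
        for key, description in summarize_json(target).items():
            print(f"{key}: {description}")
    else:
        print(f"{target} not found; pass the path to your copy instead.")
```

Note that json.load reads the whole file into memory, which may be substantial for a file holding coordinates for thousands of frames.
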
System Details

Dopamine D2 Receptor (apo_d2_inv_start)

- Protein: Human Dopamine D2 Receptor (D2R)
- System: Apo (ligand-free) receptor in inactive state
- Starting Structure: Based on PDB ID 6CM4, with third intracellular loop (ICL3) remodeled
- Force Field: CHARMM36m
- MD Software: GROMACS version 2024.2
- Protocol: Energy minimization → multi-step equilibration → 2-microsecond production run under NPT conditions
- Replicas: run1 and run6 directories contain independent simulation replicas

Dopamine D1 Receptor (apo_d1)

- Protein: Human Dopamine D1 Receptor (D1R)
- System: Apo (ligand-free) receptor
- Starting Structure: TODO: Add PDB ID or reference for the starting model
- Simulation Details: run1 directory contains primary simulation data following similar protocol

Adenosine A2A Receptor (apo_A2AR)

- Protein: Human Adenosine A2A Receptor (A2AR)
- System: Apo (ligand-free) receptor
- Starting Structure: TODO: Add PDB ID or reference for the starting model
- Simulation Details: run1 directory contains primary simulation data following similar protocol

Beta-1 Adrenergic Receptor (apo_beta1)

- Protein: Human Beta-1 Adrenergic Receptor (B1AR)
- System: Apo (ligand-free) receptor
- Starting Structure: TODO: Add PDB ID or reference for the starting model
- Simulation Details: run1 directory contains primary simulation data following similar protocol

File Descriptions

Key Files in Each Run Directory

input_files/

- system_begin.pdb - Complete system starting structure
- protein_initial.pdb - Protein-only starting structure
- *.top - Topology file
- *.ndx - Index file
- *.mdp - MD parameter files for all simulation stages
- toppar/ - Custom force field parameters (if deviations from standard CHARMM36m were applied)

production_run/

- production_run.tpr - GROMACS portable run input file
- traj_protein_noPBC.xtc - Processed trajectory file (heavy atoms only, periodic boundary conditions removed)
- step7_noPBC_prot.xtc - Alternative name for processed trajectory

Important Notes for Users

Full-System Trajectories

Full-system production trajectories (including membrane, solvent, and other components, commonly named step7_1.xtc) are excluded from this archive due to large file sizes (approximately 14-15 GB each). Researchers interested in these full trajectories may request them from the corresponding authors.

Processing Instructions

- To process full trajectories yourself, use the provided system_begin.pdb as the reference structure with processing scripts like extract_residues.py
- To convert the supplied protein-only trajectory (traj_protein_noPBC.xtc) into JSON format required by the ML pipeline, use processing scripts such as extract_residues.py (available on GitHub) with a PDB file containing only protein heavy atoms (heavy_chain.pdb)

Simulation Replicates

The two replicas (run1 and run6) are included to illustrate simulation variability. Additional replicas can be obtained from the authors upon reasonable request.

Data Usage

The processed trajectory files retain only heavy atoms of the protein and have had periodic boundary conditions removed, making them suitable for direct analysis or further processing for machine learning applications.

Citation and Contact

For complete methodological details and further context, please refer to the main publication. For questions regarding full-system trajectories or additional replicas, contact the corresponding authors.
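
After unpacking the archive, the layout listed under File Descriptions can be sanity-checked with a short script. The sketch below, in Python with only the standard library, looks for run*/ directories anywhere below a root folder and reports which of the described subdirectories or file types are missing. The names EXPECTED, check_run_dir, and check_archive are hypothetical helpers, and the exact nesting of receptor and run directories is an assumption, which is why the search is recursive. The optional toppar/ directory is deliberately not checked.

```python
from pathlib import Path

# Subdirectories and file patterns taken from the File Descriptions section.
EXPECTED = {
    "input_files": ["*.pdb", "*.top", "*.ndx", "*.mdp"],
    "production_run": ["*.tpr", "*.xtc"],
}


def check_run_dir(run_dir):
    """Return a list of human-readable problems found in one run directory."""
    run_dir = Path(run_dir)
    problems = []
    for subdir, patterns in EXPECTED.items():
        folder = run_dir / subdir
        if not folder.is_dir():
            problems.append(f"missing directory: {subdir}/")
            continue
        for pattern in patterns:
            # any() stops at the first match, so large folders stay cheap.
            if not any(folder.glob(pattern)):
                problems.append(f"no {pattern} file in {subdir}/")
    return problems


def check_archive(root):
    """Check every run*/ directory found anywhere below root."""
    root = Path(root)
    if not root.is_dir():
        return {}
    return {
        str(run.relative_to(root)): check_run_dir(run)
        for run in sorted(root.rglob("run*"))
        if run.is_dir()
    }


if __name__ == "__main__":
    # Assumed location of the unpacked archive; adjust as needed.
    for run, problems in check_archive("MD_simulation_data").items():
        print(f"{run}: {'ok' if not problems else '; '.join(problems)}")
```

An empty problem list for a run means that both subdirectories exist and each contains at least one file of every listed type; it does not verify file contents.
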
GPCR, Protein Conformation, Computational Biology, GROMACS, Molecular Dynamics
