
doi: 10.1002/prot.26263
pmid: 34651333
Abstract Given a target protein structure, the prime objective of protein design is to find amino acid sequences that will fold into the given three-dimensional structure. The protein design problem is non-deterministic polynomial-time hard (NP-hard), and the sequence search space grows exponentially with protein length. To ensure better search space exploration and faster convergence, we propose a protein modularity-based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate level of structural organization between secondary structure and domain, defined as the protein unit (PU). We adopt a divide-and-conquer approach in which a protein is split into PUs and each PU region is explored in parallel. Our analysis shows that the shared-memory implementation of the modularity-based parallel sequence search explores the search space better than traditional full-protein design. Sequence-based analysis of the designed sequences shows an average sequence similarity of 39.7% on the benchmark data set. Structure-based comparison of the modeled structures of the designed proteins with the target structures yielded an average root-mean-square deviation of 1.17 Å and an average template modeling score of 0.89. The selected modeled structures of the designed protein sequences were validated using 100 ns molecular dynamics simulations, in which 80% of the proteins showed stability better than or similar to that of the respective target proteins. Our study suggests that the modularity-based protein design algorithm can be extended to protein interaction design as well.
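The divide-and-conquer idea in the abstract (split the protein into PUs, search each PU region in parallel, then join the results) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the burial pattern string (`B` for buried, `E` for exposed), the `toy_score` function standing in for a real energy function, the hand-picked PU boundaries, and the greedy mutation search. Threads are used only because they share memory within one process; in standard CPython they would not speed up CPU-bound work like this.

```python
import random
from concurrent.futures import ThreadPoolExecutor

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")  # assumed grouping, for the toy score only


def toy_score(sequence, burial):
    """Hypothetical stand-in for an energy function (lower is better).

    Counts positions where a buried site ('B') holds a non-hydrophobic
    residue or an exposed site ('E') holds a hydrophobic one.
    """
    return sum((aa in HYDROPHOBIC) != (site == "B")
               for aa, site in zip(sequence, burial))


def search_unit(job):
    """Greedy single-point mutation search over one PU region."""
    burial, seed, steps = job
    rng = random.Random(seed)  # per-unit generator keeps runs reproducible
    current = [rng.choice(AMINO_ACIDS) for _ in burial]
    best = toy_score(current, burial)
    for _ in range(steps):
        pos = rng.randrange(len(current))
        previous = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)
        score = toy_score(current, burial)
        if score <= best:
            best = score             # keep the mutation
        else:
            current[pos] = previous  # revert a worsening mutation
    return "".join(current), best


def design_by_units(burial, boundaries, steps=2000):
    """Split the burial pattern at the PU boundaries, search each unit
    in its own thread, and concatenate the per-unit sequences."""
    jobs = [(burial[start:end], index, steps)
            for index, (start, end) in enumerate(boundaries)]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(search_unit, jobs))  # preserves unit order
    sequence = "".join(seq for seq, _ in results)
    total = sum(score for _, score in results)
    return sequence, total


if __name__ == "__main__":
    pattern = "BBEEBEBBEEEBBEBEEBBE"    # hypothetical 20-residue pattern
    units = [(0, 7), (7, 14), (14, 20)]  # hypothetical PU boundaries
    designed, score = design_by_units(pattern, units)
    print(designed, score)
```

The toy score is additive over positions, so the units can be searched independently and their scores summed. A real energy function includes interactions between residues in different PUs, which is where a shared-memory design would matter and where this sketch stops.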
Structure-Activity Relationship, Protein Conformation, Computational Biology, Proteins, Amino Acid Sequence, Molecular Dynamics Simulation, Databases, Protein, Algorithms
| selected citations | 3 |
| popularity | Average |
| influence | Average |
| impulse | Average |
