
arXiv: 2210.08171
In protein biophysics, the separation between the functionally important residues (forming the active site or binding surface) and those that create the overall structure (the fold) is a well-established and fundamental concept. Identifying and modifying those functional sites is critical for protein engineering but computationally non-trivial, and requires significant domain knowledge. To automate this process from a data-driven perspective, we propose a disentangled Wasserstein autoencoder with an auxiliary classifier, which isolates the function-related patterns from the rest with theoretical guarantees. This enables one-pass protein sequence editing and improves the understanding of the resulting sequences and editing actions involved. To demonstrate its effectiveness, we apply it to T-cell receptors (TCRs), a well-studied structure-function case. We show that our method can be used to alter the function of TCRs without changing the structural backbone, outperforming several competing methods in generation quality and efficiency, and requiring only 10% of the running time needed by baseline models. To our knowledge, this is the first approach that utilizes disentangled representations for TCR engineering.
37th Conference on Neural Information Processing Systems (NeurIPS) 2023, December 10-16, 2023, New Orleans, Louisiana, USA
Series: Advances in Neural Information Processing Systems
T-cells, auto encoders, binding surface, data driven, protein engineering, Biomolecules (q-bio.BM), cell engineering, T cells receptors, active site, functional sites, proteins, learning systems, domain knowledge, Quantitative Biology - Biomolecules, FOS: Biological sciences, cell membranes
