
Abstract As an effective programmable DNA-targeting tool, the CRISPR–Cas9 system has been adopted in a variety of biotechnological applications. However, off-target effects, which derive from the tolerance of guide-target mismatches, are regarded as a major problem in engineering CRISPR systems. To understand this tolerance, we constructed two sgRNA libraries carrying saturated single- and double-nucleotide mismatches in living bacterial cells, and profiled the comprehensive landscape of the in vivo binding affinity of dCas9 toward its DNA target when guided by each individual sgRNA carrying particular mismatches. We observed a synergistic effect in the seed region, where combinatorial double mutations caused more severe activity loss than the two corresponding single mutations. Moreover, we found that a particular mismatch type, dDrG (D = A, T, G), caused only moderate impairment of binding. To quantitatively understand the causal relationship between mismatches and the binding behaviour of dCas9, we further established a biophysical model and found that the thermodynamic properties of base pairing, coupled with the strand invasion process, can to a large extent account for the observed mismatch-activity landscape. Finally, we repurposed this model, together with a convolutional neural network constructed on the same mechanism, as a predictive tool to guide the rational design of sgRNAs for bacterial CRISPR interference.
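To make "saturated single- and double-nucleotide mismatches" concrete, the following minimal Python sketch enumerates every single and double substitution of a guide sequence. It is an illustration only, not the authors' library-construction pipeline: the 20-nt spacer length, the example sequence, and the function names `single_mismatches` and `double_mismatches` are all assumptions introduced here.

```python
from itertools import combinations, product

BASES = "ACGU"  # RNA alphabet of the sgRNA spacer


def single_mismatches(guide):
    """Return every variant of `guide` with exactly one substituted base."""
    variants = []
    for i, ref in enumerate(guide):
        for alt in BASES:
            if alt != ref:
                variants.append(guide[:i] + alt + guide[i + 1:])
    return variants


def double_mismatches(guide):
    """Return every variant of `guide` with exactly two substituted bases."""
    variants = []
    for i, j in combinations(range(len(guide)), 2):
        alts_i = [b for b in BASES if b != guide[i]]
        alts_j = [b for b in BASES if b != guide[j]]
        for a, b in product(alts_i, alts_j):
            seq = list(guide)
            seq[i], seq[j] = a, b
            variants.append("".join(seq))
    return variants


# Hypothetical 20-nt spacer, chosen only for illustration.
guide = "GACUUCAGGAUCCAUGCAAG"
singles = single_mismatches(guide)
doubles = double_mismatches(guide)
print(len(singles), len(doubles))  # 60 1710
```

Under the assumed 20-nt spacer, saturation means 20 × 3 = 60 single-mismatch guides and C(20, 2) × 9 = 1710 double-mismatch guides per target, which gives a sense of the library sizes implied by such a design.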
Chemical Biology and Nucleic Acid Chemistry, Models, Genetic, Base Pair Mismatch, CRISPR-Associated Protein 9, Escherichia coli, RNA, Thermodynamics, DNA, CRISPR-Cas Systems, Protein Binding
