
pmid: 33232622
The three-dimensional (3D) organization of the human genome is of crucial importance for gene regulation, and the CCCTC-binding factor (CTCF) plays an important role in chromatin interactions. However, it is still unclear what sequence patterns, in addition to CTCF motif pairs, determine chromatin loop formation. To discover the underlying sequence patterns, we have developed a deep learning model, called DeepCTCFLoop, to predict whether a chromatin loop can be formed between a pair of convergent or tandem CTCF motifs using only the DNA sequences of the motifs and their flanking regions. Our results suggest that DeepCTCFLoop can accurately distinguish CTCF motif pairs that form chromatin loops from those that do not. It significantly outperforms CTCF-MP, a machine learning model based on word2vec and boosted trees, when using DNA sequences only. Furthermore, we show that DNA motifs bound by several transcription factors, including ZNF384, ZNF263, ASCL1, SP1, and ZEB1, may constitute the complex sequence patterns for CTCF-mediated chromatin loop formation. DeepCTCFLoop has also been applied to disease-associated sequence variants to identify candidates that may disrupt chromatin loop formation. Therefore, our results provide useful information for understanding the mechanism of 3D genome organization and may also help annotate and prioritize the noncoding sequence variants associated with human diseases.
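As a rough illustration of the kind of input such a sequence-only model works with, the sketch below one-hot encodes the DNA around a pair of CTCF motifs and labels the pair's orientation from the motif strands. It is not the DeepCTCFLoop implementation: the function names (`one_hot`, `encode_pair`, `orientation`), the A/C/G/T channel order, the all-zero encoding of unknown bases, and the simple concatenation of the two sequences are all assumptions made for this example.

```python
# Illustrative sketch only; not the DeepCTCFLoop code.
# All names and encoding choices here are assumptions for the example.

BASES = "ACGT"  # assumed channel order


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Bases outside ACGT (for example N) become an all-zero row.
    """
    table = {b: [1 if i == j else 0 for j in range(4)]
             for i, b in enumerate(BASES)}
    return [table.get(base, [0, 0, 0, 0]) for base in seq.upper()]


def encode_pair(left_seq, right_seq):
    """Concatenate the encodings of the upstream and downstream motif
    regions (motif plus flanks) into a single matrix."""
    return one_hot(left_seq) + one_hot(right_seq)


def orientation(left_strand, right_strand):
    """Classify a motif pair by the strands of its upstream (left) and
    downstream (right) motifs."""
    if left_strand == "+" and right_strand == "-":
        return "convergent"  # motifs point toward each other
    if left_strand == right_strand:
        return "tandem"      # motifs point the same way
    return "divergent"       # motifs point away from each other


if __name__ == "__main__":
    matrix = encode_pair("ACGT", "TTNA")
    print(len(matrix), "positions x", len(matrix[0]), "channels")
    print(orientation("+", "-"))
```

A matrix of this shape is a common input form for convolutional sequence models; only convergent and tandem pairs fall within the scope described in the abstract.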
CCCTC-Binding Factor, Binding Sites, Computational Biology, DNA, Sequence Analysis, DNA, Chromatin, Cell Line, Deep Learning, Humans, Genetic Predisposition to Disease, Nucleotide Motifs, K562 Cells, HeLa Cells, Transcription Factors
| selected citations | 15 |
| popularity | Top 10% |
| influence | Average |
| impulse | Top 10% |
