Global loss of DNA methylation in mammalian genomes occurs cumulatively as a mitotic process during aging and cancer, primarily in Partially Methylated Domains (PMDs). It has been shown that local sequence context (100bp) has a strong effect on the rate of demethylation of individual CpG dinucleotides within PMDs. Here, we train a deep learning model to characterize this sequence dependence further, finding that methylation loss can be predicted from a CpG’s 150bp sequence context alone with an AUC of 0.95. We use re-methylation rates of newly synthesized DNA to show that CpGs with fast-loss sequence context are inefficiently re-methylated. Interestingly, we find that the 10% of CpGs predicted to have the “slowest” rate of loss lose almost no DNA methylation in healthy cell types. These same slow-loss CpGs lose a significant amount of DNA methylation in cancer, suggesting that they could be responsible for deregulation of genes and transposable elements that are associated with DNA hypomethylation in cancer.

This directory contains the Nov. 18, 2020 version of the human (hg19) CpG hypomethylation neural network (NN) scores in a single tab-delimited (bedgraph) file, multitissue-nn-scores.allCGs.0based.hg19.bedgraph.gz, with the following columns:
1: chromosome (hg19)
2: start coord (hg19, 0-based)
3: end coord (hg19, 0-based)
4: multi-tissue NN score (0-1). Close to 0 is classified as a slow-loss CpG; close to 1 is classified as a fast-loss CpG
5: number of CpGs in the 150 bp window (including the central CpG, so the minimum is 1)
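The five-column layout described above can be read with the Python standard library alone. The following is a minimal sketch, not part of the dataset: the names `CpGScore`, `read_scores`, `slowest_fraction`, and `open_scores` are hypothetical, the file is assumed to have no header row (any `track`, `browser`, or `#` lines are skipped defensively), and taking the lowest-scoring 10% of the records passed in is only one possible way to approximate the "slowest" class mentioned in the abstract.

```python
import gzip
import io
from typing import Iterable, Iterator, List, NamedTuple, TextIO


class CpGScore(NamedTuple):
    chrom: str      # column 1: chromosome (hg19)
    start: int      # column 2: start coordinate (0-based)
    end: int        # column 3: end coordinate
    score: float    # column 4: multi-tissue NN score, 0 (slow loss) to 1 (fast loss)
    n_cpgs: int     # column 5: CpGs in the 150 bp window, minimum 1


def read_scores(handle: TextIO) -> Iterator[CpGScore]:
    """Yield one record per data line of the tab-delimited bedgraph."""
    for line in handle:
        line = line.rstrip("\n")
        # Assumption: the file may contain blank, comment, or track lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, score, n_cpgs = line.split("\t")[:5]
        yield CpGScore(chrom, int(start), int(end), float(score), int(n_cpgs))


def slowest_fraction(records: Iterable[CpGScore], fraction: float = 0.10) -> List[CpGScore]:
    """Return the given fraction of records with the lowest NN scores."""
    ranked = sorted(records, key=lambda r: r.score)
    return ranked[: int(len(ranked) * fraction)]


def open_scores(path: str) -> TextIO:
    """Open the gzip-compressed bedgraph in text mode."""
    return gzip.open(path, "rt")


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real file; the values are made up.
    demo = io.StringIO("chr1\t100\t101\t0.05\t3\nchr1\t200\t201\t0.95\t1\n")
    for rec in slowest_fraction(read_scores(demo), fraction=0.5):
        print(rec.chrom, rec.start, rec.score)
```

Sorting every record in memory is convenient for a subset such as one chromosome; for the whole genome a streaming or array-based approach would likely be more practical.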
The full version of the NN scores, with additional details, is in the file zhou-bian.allCGs.1based.hg19.tsv.gz. Each row is a CpG, described by the following columns:
(1) chromosome
(2) the corresponding C coordinate on the forward (Watson) strand of the reference genome, in one-based coordinates
(3) neural network score
(4) number of CpGs within the 150bp sequence centered on this CpG, including the center CpG
(5) CpG is within a CpG island (0, no; 1, yes)
(6) CpG is within the ENCODE blacklist (0, no; 1, yes)

Here the CpG islands are the union set of Irizarry (Irizarry et al. 2009, Nat Genet), Takai-Jones (Takai et al. 2002, PNAS), and Gardiner-Garden CGIs (Gardiner-Garden et al. 1987, J Mol Biol). The blacklist was downloaded from https://github.com/Boyle-Lab/Blacklist/tree/master/lists.

Additional files are included here:
zhou_pmds.0based.hg19.bed.gz: Input PMD CpGs from the Zhou (multi-tissue) dataset
bian_pmds.crc01.0based.hg19.bed.gz: Input PMD CpGs from the Bian (intra-tumor) dataset
zhou_bian_train_test_data.tar.gz: All training and test CpGs, including labels and sequence windows
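As an illustration of how the full per-CpG table described above might be consumed, the sketch below parses its rows and drops blacklisted CpGs. It rests on several assumptions that the description does not confirm: that the blacklist flag is the sixth column, that a header row (if any) can be recognised by a non-numeric second field, and that the 0-based start used in the bedgraph file equals the 1-based C coordinate minus one. All names (`FullCpG`, `read_full_scores`, `outside_blacklist`, `zero_based_start`) are hypothetical.

```python
import io
from typing import Iterable, Iterator, NamedTuple, TextIO


class FullCpG(NamedTuple):
    chrom: str
    pos: int            # 1-based coordinate of the C on the forward strand
    score: float        # neural network score
    n_cpgs: int         # CpGs in the 150bp window, including the center CpG
    in_cgi: bool        # CpG island flag (0, no; 1, yes)
    in_blacklist: bool  # ENCODE blacklist flag (assumed to be column 6)


def read_full_scores(handle: TextIO) -> Iterator[FullCpG]:
    """Yield one record per row of the tab-delimited table."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # Assumption: skip short lines and a header row, if one is present.
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        chrom, pos, score, n_cpgs, cgi, blacklist = fields[:6]
        yield FullCpG(chrom, int(pos), float(score), int(n_cpgs),
                      cgi == "1", blacklist == "1")


def outside_blacklist(records: Iterable[FullCpG]) -> Iterator[FullCpG]:
    """Keep only CpGs that are not flagged as blacklisted."""
    return (r for r in records if not r.in_blacklist)


def zero_based_start(record: FullCpG) -> int:
    """Convert the 1-based C coordinate to a 0-based start (assumed mapping)."""
    return record.pos - 1


if __name__ == "__main__":
    # Made-up rows standing in for the real file.
    demo = io.StringIO("chr1\t10469\t0.12\t9\t1\t0\nchr1\t10471\t0.91\t9\t1\t1\n")
    for rec in outside_blacklist(read_full_scores(demo)):
        print(rec.chrom, zero_based_start(rec), rec.score, rec.in_cgi)
```

The real file is gzip-compressed, so in practice the handle would come from `gzip.open(path, "rt")` rather than an in-memory string.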
DNA methylation, Neural network, machine learning, deep learning, hypomethylation, cancer, aging