
Detecting and assessing statistical significance of differentially methylated regions (DMRs) is a fundamental task in methylome association studies. While the average differential methylation in different phenotype groups has been the inferential focus, methylation changes in chromosomal regions may also present as differential variability, i.e., variably methylated regions (VMRs). Testing statistical significance of regional differential methylation is a challenging problem, and existing algorithms do not provide accurate type I error control for genome-wide DMR or VMR analysis. No algorithm has been publicly available for detecting VMRs. We propose DMseg, a Python algorithm with efficient DMR/VMR detection and significance assessment for array-based methylome data, and compare its performance to Bumphunter, a popular existing algorithm. Operationally, DMseg searches for DMRs or VMRs within CpG clusters that are adaptively determined by both gap distance and correlation between contiguous CpG sites in a microarray. Levene test was implemented for assessing differential variability of individual CpGs. A likelihood ratio statistic is proposed to test for a constant difference within CpGs in a DMR or VMR to summarize the evidence of regional difference. Using a stratified permutation scheme and pooling null distributions of LRTs from clusters with similar numbers of CpGs, DMseg provides accurate control of the type I error rate. In simulation experiments, DMseg shows superior power than Bumphunter to detect DMRs. Application to methylome data of Barrett's esophagus and esophageal adenocarcinoma reveals a number of DMRs and VMRs of biological interest.
Methodology (stat.ME), FOS: Computer and information sciences, Statistics - Methodology
Methodology (stat.ME), FOS: Computer and information sciences, Statistics - Methodology
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
