
Genes that are more closely spaced on the chromosome than expected by chance are said to be spatially clustered. Standard tests of clustering versus uniformity do not take into account two important features of genes: the high variability of gene length and the low probability that gene locations overlap (exclusion). We show by simulation that the standard null distributions, which ignore length and exclusion, do not adequately approximate the true null distributions of standard tests such as the chi-squared test. We therefore recommend bootstrap sampling to estimate the null distributions. Simulations demonstrate that the chi-squared goodness-of-fit test is a more powerful test of clustering than two other commonly used tests, Kolmogorov and Cramér-von Mises, when the distribution of gene lengths and locations is modeled by a mixture of exponentials and there is a single cluster. The chi-squared test requires binning the gene locations; the number of genes in each bin can then be compared to the expected maximum number under a random distribution to determine the locations of gene clusters and gene deserts. The bootstrap method for testing clustering is illustrated using data from human chromosome 22.
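
As a rough illustration of the bootstrap idea, the following Python sketch estimates the null distribution of the chi-squared statistic while respecting gene length and exclusion. It is not the authors' implementation. All function names, the choice of gene start positions as the binned locations, the number of bins, and the scheme of resampling observed lengths with replacement are assumptions made for this example. Non-overlapping placements are generated by scattering sorted uniform points over the free (non-genic) space and then shifting each gene by the total length of the genes before it.

```python
import numpy as np


def chi2_stat(positions, chrom_length, n_bins):
    """Chi-squared goodness-of-fit statistic against equal expected bin counts."""
    counts, _ = np.histogram(positions, bins=n_bins, range=(0.0, chrom_length))
    expected = len(positions) / n_bins
    return float(((counts - expected) ** 2 / expected).sum())


def place_without_overlap(lengths, chrom_length, rng):
    """Place genes uniformly at random subject to exclusion (no overlaps).

    Returns the start positions (ascending) and the lengths in placement order.
    """
    lengths = rng.permutation(np.asarray(lengths, dtype=float))
    free = chrom_length - lengths.sum()
    if free <= 0:
        raise ValueError("genes do not fit on the chromosome")
    # Sorted uniform points in the free space, shifted by preceding gene lengths.
    gaps = np.sort(rng.uniform(0.0, free, size=len(lengths)))
    offsets = np.concatenate(([0.0], np.cumsum(lengths)[:-1]))
    return gaps + offsets, lengths


def bootstrap_pvalue(starts, lengths, chrom_length, n_bins=20, n_boot=2000, seed=0):
    """Bootstrap p-value for clustering (assumed scheme, see the text above)."""
    rng = np.random.default_rng(seed)
    lengths = np.asarray(lengths, dtype=float)
    observed = chi2_stat(np.asarray(starts, dtype=float), chrom_length, n_bins)
    null = np.empty(n_boot)
    for b in range(n_boot):
        # Resample lengths with replacement; retry if they cannot fit.
        while True:
            resampled = rng.choice(lengths, size=len(lengths), replace=True)
            if resampled.sum() < chrom_length:
                break
        sim_starts, _ = place_without_overlap(resampled, chrom_length, rng)
        null[b] = chi2_stat(sim_starts, chrom_length, n_bins)
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (n_boot + 1)
    return observed, float(p_value), null
```

A large observed statistic relative to the simulated null values suggests clustering. The same simulated placements could also be used to estimate the distribution of the maximum bin count, which is one way to flag individual bins as candidate clusters.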
