
Background: Most RNA-seq datasets harbor genes with extreme expression levels in some samples. Such extreme outliers are usually treated as technical errors and are removed from the data before further statistical analysis. Here we focus on the patterns of such outlier expression to investigate whether they provide insights into the underlying biology.

Results: Our study is based on multiple datasets, including data from outbred and inbred mice, GTEx data from humans, data from different Drosophila species, and single-nuclei sequencing data from human brain tissues. All show comparable general patterns of outlier expression, indicating that this is a generalizable biological effect. Different individuals harbor very different numbers of outlier genes, with some individuals showing extreme numbers in only one out of several organs. Outlier gene expression occurs as part of co-regulatory modules, some of which correspond to known pathways. In a three-generation family analysis in mice we find that most extreme over-expression is not inherited, but appears to be sporadically generated. Genes encoding prolactin and growth hormone are also among the co-regulated genes with extreme outlier expression, both in mice and humans; for these we also include a longitudinal expression analysis of protein data.

Conclusions: We show that outlier patterns of gene expression are a biological reality occurring universally across tissues and species. Most of the outlier expression is spontaneous and not inherited. We discuss the interpretation that the outlier patterns reflect edge-of-chaos effects that are expected for systems of non-linear interactions and feedback loops, such as gene regulatory networks.
