
Abstract

Motivation: Identifying the genes regulated by a given transcription factor (TF) (its ‘target genes’) is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF’s binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene.

Results: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF’s binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene’s promoter, achieving median precision above 60%. T-Gene is available as a web server and as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data are not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions.

Availability and implementation: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org.

Supplementary information: Supplementary data are available at Bioinformatics online.
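The two inputs named in the abstract, genomic distance and histone/expression correlation, can be illustrated with a minimal sketch. This is not the T-Gene score, whose formula the abstract does not give. It only computes the two raw quantities for each candidate site-gene link, under stated assumptions: Pearson correlation across tissues, distance measured from the binding site to the gene's transcription start site (TSS), and a hypothetical `max_distance` cutoff. All function names, field names and data below are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / sqrt(vx * vy)


def distance_to_tss(site_start, site_end, tss):
    """Distance in bp from a binding site to a TSS (0 if the site spans it)."""
    if site_start <= tss <= site_end:
        return 0
    return min(abs(tss - site_start), abs(tss - site_end))


def candidate_links(sites, genes, max_distance=500_000):
    """List site-gene pairs within max_distance with both raw features.

    The cutoff and the ordering (high correlation first, then short
    distance) are illustrative choices, not taken from the paper.
    """
    links = []
    for site in sites:
        for gene in genes:
            d = distance_to_tss(site["start"], site["end"], gene["tss"])
            if d > max_distance:
                continue
            r = pearson(site["histone"], gene["expression"])
            links.append({"site": site["id"], "gene": gene["id"],
                          "distance": d, "correlation": r})
    links.sort(key=lambda link: (-link["correlation"], link["distance"]))
    return links


# Made-up example: one site, two genes, three tissues.
sites = [{"id": "s1", "start": 100, "end": 200, "histone": [1.0, 2.0, 3.0]}]
genes = [
    {"id": "g1", "tss": 500, "expression": [2.0, 4.0, 6.0]},
    {"id": "g2", "tss": 10_000_000, "expression": [3.0, 1.0, 2.0]},
]
links = candidate_links(sites, genes)
```

With distance alone, the same structure still applies: the correlation field would simply be unavailable and links would be ordered by distance.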
Statistics and Probability, Chromatin Immunoprecipitation, Biochemistry, Binding Sites, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Gene Expression Regulation, Molecular Biology, Algorithms, Software, Transcription Factors
