Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes

Kreiman, Gabriel (2004)

Sequence information and high‐throughput methods to measure gene expression levels open the door to exploring transcriptional regulation using computational tools. Combinatorial regulation and sparseness of regulatory elements throughout the genome allow organisms to control the spatial and temporal patterns of gene expression. Here we study the organization of cis‐regulatory elements in sets of co‐regulated genes. We build an algorithm to search for combinations of transcription factor binding sites that are enriched in a set of potentially co‐regulated genes with respect to the whole genome. No knowledge is assumed about the involvement of specific sets of transcription factors. Instead, the search is conducted exhaustively over combinations of up to four binding sites obtained from databases or motif search algorithms. We evaluate the performance on random sets of genes as a negative control and on three biologically validated sets of co‐regulated genes in yeasts, flies and humans. We show that we can detect DNA regions that play a role in the control of transcription. These results shed light on the structure of transcription regulatory regions in eukaryotes and can be directly applied to clusters of co‐expressed genes obtained in gene expression studies. Supplementary information is available at ∼kreiman/resources/cisregul/.
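
The search described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state which enrichment statistic is used, so a one-sided hypergeometric test is assumed here, and the spatial clustering of sites within a regulatory region is ignored. The names `enriched_combinations`, `hypergeom_sf`, `genome_sites`, `max_size` and `alpha` are hypothetical and introduced only for illustration.

```python
from itertools import combinations
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the combination."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enriched_combinations(genome_sites, gene_set, max_size=4, alpha=0.01):
    """Exhaustively test combinations of up to `max_size` binding-site motifs.

    genome_sites: dict mapping every gene in the genome to the set of motif
                  names found in its regulatory region (assumed input format).
    gene_set:     iterable of potentially co-regulated genes.
    Returns (combination, hits_in_set, hits_in_genome, p_value) tuples,
    sorted by p-value. The hypergeometric test is an assumption of this sketch.
    """
    genes = set(genome_sites)
    gene_set = set(gene_set) & genes
    N, n = len(genes), len(gene_set)

    # Only motifs that occur in the gene set can form an enriched combination.
    motifs = sorted(set().union(*(genome_sites[g] for g in gene_set)))

    hits = []
    for size in range(1, max_size + 1):
        for combo in combinations(motifs, size):
            required = set(combo)
            # Genes whose regulatory region contains every motif in the combination.
            K = sum(1 for g in genes if required <= genome_sites[g])
            k = sum(1 for g in gene_set if required <= genome_sites[g])
            if k == 0:
                continue
            p = hypergeom_sf(k, N, K, n)
            if p <= alpha:
                hits.append((combo, k, K, p))
    return sorted(hits, key=lambda h: h[3])
```

Running the same function on randomly drawn gene sets of equal size gives a negative control in the spirit of the one mentioned above. Because the number of combinations grows quickly with the number of motifs, any real use would also need a correction for multiple testing, which this sketch omits.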
