The demarcation of transcription factor binding sites through the analysis of DNase-seq data

Doctoral thesis English OPEN
Piper, Jason
  • Subject: QR
    mesheuropmc: genetic processes

The expression of eukaryotic genes is controlled by non-coding regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins (transcription factors). In multicellular organisms, these elements must be characterised in order to understand how a single genome is used to generate a multitude of cell types, and how aberrant regulation of transcription contributes to disease processes. This involves identifying the transcription factor binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase I followed by high-throughput sequencing, and the subsequent analysis of regions protected from digestion (DNase-seq footprinting), allows transcription factor binding to be quantified genome-wide. However, the handful of methods for analysing DNase-seq data has not been extensively validated or benchmarked. This thesis describes a novel footprinting algorithm, Wellington, which is presented in the context of a comprehensive comparison of several other DNase-seq footprinting algorithms on a multitude of datasets. Wellington outperforms other methods in almost all situations. An open-source software package, pyDNase, which facilitates interaction with DNase-seq data and provides many tools for DNase-seq analysis, is also presented. Wellington is used to perform footprinting on clinical samples to validate cell lines as a model system, and to identify the binding partners of the RUNX1/ETO fusion protein in t(8;21) AML. By expanding the Wellington method, differential footprinting is shown to be able to link differences in transcription factor binding at promoters to changes in gene expression. Applying this methodology to a range of haematopoietic cell types illustrates the ability of differential footprinting to identify key regulators in the haematopoietic lineage.
These results represent advances in the methods available to analyse DNase-seq data (all of which have been released as free, open-source software) and demonstrate the power of integrating DNase-seq footprinting with other functional genomic assays to study transcriptional regulation.