https://dx.doi.org/10.18154/rw...
Doctoral thesis . 2022
Data sources: Datacite


Computational method for single Cell ATAC-seq imputation and dimensionality reduction

Authors: Li, Zhijian;

Abstract

Chromatin accessibility, or the physical access to chromatinized DNA, plays an essential role in controlling the temporal and spatial expression of genes in eukaryotic cells. Assay for transposase-accessible chromatin followed by high-throughput sequencing (ATAC-seq) is a sensitive and straightforward protocol for profiling chromatin accessibility in a genome-wide manner. Moreover, combined with single-cell sequencing technology, single-cell ATAC-seq (scATAC-seq) is able to map regulatory variation from hundreds to thousands of cells at single-cell resolution, further expanding its applications. However, a major drawback of scATAC-seq data is its inherent sparsity. In other words, many open chromatin regions are not detected due to low input or loss of DNA material in the scATAC-seq experiment, leaving a large number of missing values in the derived count matrix. This phenomenon is known as “drop-outs” and is also observed in other single-cell sequencing data, such as scRNA-seq. Although many computational methods have been proposed to address this issue for scRNA-seq based on data imputation or denoising, there has been little effort to assess the usability of these methods on scATAC-seq data. Moreover, the development of specific algorithms for imputing or denoising scATAC-seq data is still poorly explored.

Another critical issue when dealing with the scATAC-seq matrix is its high dimensionality. Because a gene is often regulated by multiple cis-regulatory elements (CREs), the number of features in scATAC-seq (i.e., peaks) is usually one order of magnitude higher than the number of features in scRNA-seq (i.e., genes). This high dimensionality poses a challenge for downstream analyses of scATAC-seq data, such as clustering and visualization. Therefore, it is common to first perform dimensionality reduction before interpreting the data. However, the standard computational methods for scRNA-seq data are potentially unsuitable for this task due to the low-count information of scATAC-seq data, i.e., a maximum of 2 digestion events is expected for an individual cell in a specific open chromatin region.

In this thesis, we propose scOpen, a computational approach for simultaneous quantification of single-cell open chromatin status and reduction of the dimensionality, to address the aforementioned issues for scATAC-seq data analysis. More formally, scOpen performs imputation and denoising of a scATAC-seq matrix via regularized non-negative matrix factorization (NMF) based on term frequency-inverse document frequency (TF-IDF) transformation. We show that scOpen is able to improve several crucial downstream analysis steps of scATAC-seq data, such as clustering, visualization, prediction of cis-regulatory DNA interactions, and delineation of regulatory features. Moreover, we also demonstrate its power to dissect chromatin accessibility dynamics on large-scale scATAC-seq data from intact mouse kidney tissue. Finally, we perform additional analyses to investigate the regulatory programs that drive the development of kidney fibrosis. Our analyses shed new light on the mechanisms of myofibroblast differentiation driving kidney fibrosis and chronic kidney disease (CKD). Altogether, these results demonstrate that scOpen is a useful computational approach in biological studies involving single-cell open chromatin data processing.
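The general idea of combining a TF-IDF transformation with regularized NMF can be illustrated with a minimal NumPy sketch. This is not the scOpen implementation: the TF-IDF variant, the multiplicative-update solver, the Frobenius-norm (L2) penalty, the parameter names (`rank`, `lam`, `n_iter`), and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def tf_idf(counts):
    """TF-IDF transform of a peaks x cells matrix (one common variant, assumed here)."""
    x = (counts > 0).astype(float)  # binarize: peak detected as open or not
    # Term frequency: share of each cell's detected peaks (columns are cells).
    tf = x / np.maximum(x.sum(axis=0, keepdims=True), 1.0)
    # Inverse document frequency: peaks open in few cells get larger weights.
    n_cells = x.shape[1]
    idf = np.log(1.0 + n_cells / np.maximum(x.sum(axis=1, keepdims=True), 1.0))
    return tf * idf


def regularized_nmf(x, rank, lam=1e-3, n_iter=200, seed=0, eps=1e-10):
    """Approximately minimize ||X - WH||_F^2 + lam * (||W||_F^2 + ||H||_F^2), W, H >= 0.

    Uses multiplicative updates, which keep W and H non-negative.
    """
    rng = np.random.default_rng(seed)
    w = rng.random((x.shape[0], rank))
    h = rng.random((rank, x.shape[1]))
    for _ in range(n_iter):
        h *= (w.T @ x) / (w.T @ w @ h + lam * h + eps)
        w *= (x @ h.T) / (w @ h @ h.T + lam * w + eps)
    return w, h


# Hypothetical toy data: 6 peaks x 8 cells, two cell groups with distinct open peaks.
rng = np.random.default_rng(0)
truth = np.zeros((6, 8))
truth[:3, :4] = 1
truth[3:, 4:] = 1
observed = truth * (rng.random(truth.shape) > 0.4)  # simulate roughly 40% drop-outs

w, h = regularized_nmf(tf_idf(observed), rank=2)
imputed = w @ h    # dense peaks x cells matrix (imputed and denoised scores)
embedding = h.T    # cells x rank matrix (low-dimensional representation)
```

In this sketch a single factorization serves both purposes named in the abstract: the product `w @ h` fills in values for undetected peaks, while `h.T` gives each cell a low-dimensional coordinate that could be passed to clustering or visualization.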

Dissertation, RWTH Aachen University, 2022. Aachen: RWTH Aachen University, 1 online resource: illustrations, diagrams (2022).

Published by RWTH Aachen University, Aachen

Country
Germany
Keywords

info:eu-repo/classification/ddc/004
