ZENODO
Dataset . 2026
License: CC BY
Data sources: ZENODO

An enhanced two-pass PCA workflow applied to rare disease RNA-Seq data reveals hidden structure and biologically relevant variation

Authors: García-Criado, Federico; Seoane Zonjic, Pedro

Abstract

This repository stores DESeq2-normalized gene expression count tables derived exclusively from RNA-seq data. It contains three independent datasets, one for each of the three diseases studied. Each dataset is described below, focusing only on the RNA-seq count information.

sys_normalized_counts_DESeq2.txt

This file contains DESeq2-normalized RNA-seq gene expression counts obtained from human brain organoids used to study Schaaf–Yang syndrome (SYS), which is caused by truncating mutations in the MAGEL2 gene. Rows correspond to genes annotated with Ensembl gene identifiers (e.g. ENSG00000227232), and columns correspond to individual RNA-seq samples. Each column name encodes the experimental metadata using the following structure:

S_<individual>_<replicate>_<time>_<organoid_type>

where:
  • individual identifies the donor (S_135 corresponds to the control individual and S_66 to the SYS patient),
  • replicate indicates the organoid batch,
  • time indicates the culture time point (30d, 60d, or 90d),
  • organoid_type indicates the organoid subtype, either human cortical spheroids (hCS) or human subpallial spheroids (hSS).

Sample labels are: S_135_5_30d_hCS, S_135_5_30d_hSS, S_135_5_60d_hCS, S_135_5_60d_hSS, S_135_5_90d_hCS, S_135_5_90d_hSS, S_66_1_30d_hCS, S_66_1_30d_hSS, S_66_1_60d_hCS, S_66_1_60d_hSS, S_66_1_90d_hCS, S_66_1_90d_hSS.

The values in the table are normalized gene expression counts produced by DESeq2 from paired-end RNA-seq data (150 bp reads, NovaSeq platform), with an average sequencing depth of approximately 100 million reads per sample.

lafora_normalized_counts_DESeq2.txt

This file contains DESeq2-normalized RNA-seq gene expression counts obtained from a mouse model of Lafora disease, a neurodegenerative disorder caused by mutations in the EPM2A or EPM2B genes. Rows correspond to genes annotated with Ensembl mouse gene identifiers (e.g. ENSMUSG00000051951), and columns correspond to individual RNA-seq samples. Each column name encodes the experimental group and biological replicate using the following structure:

<group>_<replicate>

where group is one of CTL, epm2a, or epm2b:
  • CTL indicates wild-type control mice,
  • epm2a indicates Epm2a knockout mice (Epm2a-/-),
  • epm2b indicates Epm2b knockout mice (Epm2b-/-),
  • replicate indicates the individual biological replicate.

Sample labels are: CTL_1, CTL_2, CTL_3, CTL_4, epm2a_1, epm2a_2, epm2a_3, epm2b_1, epm2b_2, epm2b_3, epm2b_4.

The dataset includes RNA-seq data from four wild-type control mice and seven knockout mice (three Epm2a-/- and four Epm2b-/-). The values in the table are normalized gene expression counts produced by DESeq2 from RNA-seq data, and were used for differential expression analysis comparing control and mutant animals.

pmm2_normalized_counts_DESeq2.txt

This file contains DESeq2-normalized RNA-seq gene expression counts obtained from skin fibroblast cell lines of patients affected by PMM2 congenital disorder of glycosylation (PMM2-CDG), which is caused by loss-of-function mutations in the PMM2 gene. Rows correspond to genes annotated with Ensembl human gene identifiers (e.g. ENSG00000279457), and columns correspond to individual RNA-seq samples. Each column name encodes the patient identifier, disease severity group, and biological replicate using the following structure:

P<patient>_<severity>_<replicate>

where:
  • patient identifies the individual patient,
  • severity indicates disease severity, classified as HIGH or LOW,
  • replicate indicates the biological replicate.

Sample labels are: P10_HIGH_4, P11_HIGH_5, P12_HIGH_6, P8_HIGH_2, P9_HIGH_3, P1_LOW_1, P2_LOW_2, P3_LOW_3, P4_LOW_4, P5_LOW_5.

The dataset includes RNA-seq data from 10 PMM2-CDG patient-derived fibroblast cell lines, with five samples classified as high-severity and five as low-severity. The values in the table are normalized gene expression counts produced by DESeq2 from RNA-seq data, and were used for differential expression analysis comparing the low-severity (control) and high-severity (treat) patient groups.
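The column-name conventions described above can be decoded programmatically to build a sample metadata table. The following is a minimal sketch using only the Python standard library. It is not the authors' code: the regular expressions are inferred from the sample labels listed in this description, and the names `PATTERNS`, `parse_label`, and the `condition` field are hypothetical, introduced only for illustration.

```python
import re

# Patterns inferred from the sample labels listed in the description
# (an assumption, not taken from the authors' pipeline).
PATTERNS = {
    # e.g. S_135_5_30d_hCS
    "sys": re.compile(
        r"^S_(?P<individual>\d+)_(?P<replicate>\d+)"
        r"_(?P<time>\d+d)_(?P<organoid_type>hCS|hSS)$"
    ),
    # e.g. CTL_1, epm2a_2, epm2b_4
    "lafora": re.compile(r"^(?P<group>CTL|epm2a|epm2b)_(?P<replicate>\d+)$"),
    # e.g. P10_HIGH_4, P1_LOW_1
    "pmm2": re.compile(
        r"^P(?P<patient>\d+)_(?P<severity>HIGH|LOW)_(?P<replicate>\d+)$"
    ),
}

# Donor-to-condition mapping stated in the description:
# S_135 is the control individual, S_66 the SYS patient.
SYS_CONDITION = {"135": "control", "66": "SYS"}


def parse_label(dataset, label):
    """Split one column name into its metadata fields (all values as strings)."""
    match = PATTERNS[dataset].match(label)
    if match is None:
        raise ValueError(f"{label!r} does not follow the {dataset} naming scheme")
    fields = match.groupdict()
    if dataset == "sys":
        # Hypothetical convenience field derived from the donor identifier.
        fields["condition"] = SYS_CONDITION.get(fields["individual"], "unknown")
    return fields


# Usage: build one metadata record per sample label.
lafora_labels = ["CTL_1", "CTL_2", "epm2a_1", "epm2b_4"]
lafora_metadata = [parse_label("lafora", name) for name in lafora_labels]
```

If the count tables are loaded into a data frame, the same function could be mapped over its column names to obtain the grouping variables needed for plotting or differential expression. The file delimiter is not stated in this description, so it would need to be checked before loading.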
