
The dataset contains the raw and intermediate files and the scripts required to reproduce the results associated with the manuscript "Assigning transcriptomic subtypes to CLL samples using nanopore RNA-sequencing and self-organizing maps". Here, we demonstrate that integrating publicly available short-read data with in-house generated ONT data, along with the application of machine learning approaches, enables the characterization of the CLL transcriptome landscape, the identification of clinically relevant molecular subtypes, and the assignment of these subtypes to nanopore-sequenced samples.

-------------------------------------------------------------------------------------------------------------------------------------------

The ONT_Projection_paper.zip archive contains the scripts and data used to generate the results for the initial submission of the paper. The archive contains the following:

Scripts
Projection_CML_CLL_ONT.Rproj - project workspace and metadata about available files and datasets.
CLL_ONT_4_pub.Rmd - R Markdown file with the complete analysis workflow. It includes scripts for data conversion and analysis.
test_SVM_ONT.r - R script for supervised projection of ONT sequencing data onto the CLL Map SOM landscape and assignment of transcriptome subtypes.
phenomap.R - R script for generating phenotype maps.
SOM2jpeg.R - R script for saving SOM portrait images.
assign_SOM_class.R - R script for assigning transcriptome subtypes to nanopore sequencing samples.
Raw Data
ONT_exp_matrix_w_samplenames.csv - raw count matrix of nanopore sequencing samples
Sample_metadata.csv - nanopore sequencing sample metadata
cllmap_rnaseq_tpms_full.csv - TPM value matrix of the CLL Map Project [R1]
cllmap_participants.csv - metadata of the CLL Map Project

Intermediate files
CLL_MAP_Knisbacher_2022.Rdata - Rdata object with the CLL Map TPM matrix and metadata
CLL_MAP_Knisbacher_2022_adj.Rdata - Rdata object with the batch-corrected CLL Map TPM matrix and metadata
ont_merged_counts.Rdata - Rdata object with the raw count matrix of nanopore sequencing samples
bmTable.Rdata - Rdata object with the ENSEMBL to official gene symbol conversion table
CLL-ONT.Rdata - Rdata object with the TPM value matrix (gene symbols as row names) of nanopore sequencing samples
metadata.pred - Folder with supSOM images of ONT samples
mean.m.tr.pred - Folder with group SOM images of CLL Map transcriptomic subtypes.

Result files
results.CLLMAPadj_overExp_2 - Results - Folder with the results of CLL Map transcriptomic portrayal using the oposSOM pipeline [R2].
all_significant_GS.csv - Functional annotation of SOM gene modules (spots). This file contains significant (FDR-adjusted) gene sets.
specific_GS.csv - Functional annotation of SOM gene modules (spots). This file contains significant (FDR-adjusted) gene sets specific to a given spot.

References
R1. CLL-map Portal. https://cllmap.org/. Last accessed December 20, 2024.
R2. Löffler-Wirth H, Kalcher M, Binder H. oposSOM: R-package for high-dimensional portraying of genome-wide expression landscapes on bioconductor. Bioinformatics. 2015 Oct 1;31(19):3225-7. doi: 10.1093/bioinformatics/btv342. Epub 2015 Jun 10. PMID: 26063839.

---------------------------------------------------------------------------------------------------------------------------------------------

The ONT_Projection_paper_revision.zip archive contains additional scripts created in response to the reviewers' comments during the first round of revisions.
CLL_ONT_revision.Rmd - R Markdown file containing revision-related scripts.
hr_table_ffs.csv and hr_table_os.csv - Hazard ratio tables from the multivariable Cox regression models for failure-free survival and overall survival, respectively, with PAT types, gender, and spot I as independent variables.
