
The identification of cell types in complex tissues is an important step in research into cellular heterogeneity in disease. We present a linear fast semi-supervised clustering (LFSC) algorithm that utilizes reference samples generated from bulk RNA sequencing data to identify cell types from single-cell transcriptomes. An anchor graph is constructed to depict the relationship between reference samples and cells. By applying a connectivity constraint to the learned graph, LFSC enables the preservation of the underlying cluster structure. Moreover, the overall complexity of LFSC is linear to the size of the data, which greatly improves effectiveness and efficiency. By applying LFSC to real single-cell RNA sequencing datasets, we discovered that it has superior performance over existing baseline methods in clustering accuracy and robustness. An application using infiltrating T cells in liver cancer demonstrates that LFSC can successfully find new cell types, discover differently expressed genes, and explore new cancer-associated biomarkers.
single-cell RNA-seq, anchor graph, Genetics, bulk RNA-seq, QH426-470, data integration, clustering
single-cell RNA-seq, anchor graph, Genetics, bulk RNA-seq, QH426-470, data integration, clustering
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 2 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
