
Abstract
Single-cell RNA sequencing (scRNA-seq) technologies facilitate the characterization of transcriptomic landscapes in diverse species, tissues, and cell types with unprecedented molecular resolution. To evaluate biological hypotheses using high-dimensional single-cell gene expression data, most computational and statistical methods depend on a gene feature selection step that identifies genes with high biological variability and reduces computational complexity. Although many gene selection methods have been developed for scRNA-seq analysis, a systematic comparison of the assumptions, statistical models, and selection criteria used by these methods is still lacking. In this article, we summarize and discuss 17 computational methods for selecting gene features in unsupervised analysis of single-cell gene expression data, using unified notation and statistical frameworks. Our discussion provides a summary that helps practitioners select appropriate methods based on their assumptions and applicability, and assists method developers in designing new computational tools for unsupervised learning of scRNA-seq data.
single-cell genomics, Bioinformatics, Sequence Analysis, RNA, Human Genome, Computational Biology, Gene Expression, Bioengineering, Computation Theory and Mathematics, unsupervised learning, feature selection, Networking and Information Technology R&D (NITRD), Genetics, RNA, Humans, Generic health relevance, Biochemistry and Cell Biology, Single-Cell Analysis, highly variable genes, Other Information and Computing Sciences, Sequence Analysis, Biotechnology
