
ABSTRACT Droplet-based single-cell transcriptomics has recently enabled parallel screening of tens of thousands of single cells. Clustering methods that scale to such high-dimensional data without compromising accuracy are scarce. We exploit Locality Sensitive Hashing, an approximate nearest neighbor search technique, to develop dropClust, a de novo clustering algorithm for large-scale single-cell data. On a number of real datasets, dropClust outperformed existing best-practice methods in execution time, clustering accuracy, and detectability of minor cell subtypes.
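As a rough illustration of the approximate nearest neighbor idea named in the abstract, the sketch below implements random-hyperplane Locality Sensitive Hashing in plain Python. It is not dropClust's implementation: the hash family, the function names (`make_hyperplanes`, `signature`, `build_index`, `candidates`), and the parameters are assumptions chosen for illustration only. Vectors pointing in similar directions tend to receive the same bit signature, so a neighbor search only has to scan one bucket rather than every cell.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the side it falls on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(vectors, planes):
    """Group vector indices into buckets keyed by their signature."""
    buckets = defaultdict(list)
    for i, vec in enumerate(vectors):
        buckets[signature(vec, planes)].append(i)
    return buckets


def candidates(query, planes, buckets):
    """Return indices sharing the query's bucket (approximate neighbor candidates)."""
    return buckets.get(signature(query, planes), [])
```

In practice, several independent hash tables are usually combined to raise recall, and the candidates from each bucket are re-ranked with an exact distance. Both steps are omitted here for brevity.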
Sequence Analysis, RNA; Gene Expression Profiling; Computational Biology; Reproducibility of Results; Jurkat Cells; HEK293 Cells; RNA, Small Cytoplasmic; Leukocytes, Mononuclear; Methods Online; Cluster Analysis; Humans; Single-Cell Analysis; Algorithms; Cells, Cultured; Megakaryocyte Progenitor Cells
| selected citations | 112 |
| popularity | Top 1% |
| influence | Top 10% |
| impulse | Top 1% |
