Benchmarking of computational demultiplexing methods for single-nucleus RNA sequencing data

Publication: Article, 01 Jul 2025, Belgium, English
Publisher: Oxford University Press (OUP)
Journal: Briefings in Bioinformatics, volume 26 (issn: 1467-5463, eissn: 1477-4054)

Authors: Fu, Yile; Youness, Mohamad; Virzi, Alessia; Song, Xinran; Tubeeckx, Michiel R.L.; De Keulenaer, Gilles W.; Heidbuchel, Hein; +4 Authors

doi: 10.1093/bib/bbaf371

pmid: 40702707

pmc: PMC12286777

handle: 10067/2160240151162165141

Abstract

Single-nucleus RNA sequencing (snRNA-Seq) has transformed our understanding of complex tissues, providing insights into cellular composition, heterogeneity in gene expression between cells, and their alterations in development and disease. High costs, however, constrain the number of samples analysed. One solution is to pool samples and demultiplex them after sequencing, based either on prior labelling with antibodies or lipid anchors conjugated to DNA barcodes (cell hashing and MULTI-seq) or on genetic differences between samples. However, there remains no comprehensive evaluation of these demultiplexing tools to guide selection between them. Here, we benchmark the leading software (Vireo, Souporcell, Freemuxlet, scSplit) for demultiplexing samples using genetic variants. We further compared genetic variants obtained from SNP array analysis of gDNA with those obtained from sample-matched bulk RNA-Seq data, identified using three different variant calling tools (BCFtools, cellSNP, FreeBayes). Demultiplexing performance was evaluated on simulated multiplexed datasets comprising two, four, and six samples with doublet percentages between 0% and 30%, and validated against demultiplexing using sex-linked genes. Software implementation and execution were evaluated by run speed, robustness, scalability, and usability. Our results show that all tools except scSplit provide high recall and precision, with an accuracy of 80%–85%. Vireo achieved the best accuracy. Demultiplexing tools were differentially affected by the variant calling tool with which they were paired. For all tools, accuracy decreased as the percentage of doublets increased. We demonstrate demultiplexing in the analysis of pooled real-world 10x RNA-Seq data from the human heart and from different species, along with its advantages for doublet detection and removal.
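The abstract reports accuracy, recall, and precision for each tool on simulated pools where the true donor of every barcode is known. As a rough illustration of how such metrics can be computed, the sketch below scores predicted labels against ground truth. It is not the authors' evaluation code: the function name `demux_metrics`, the label strings ("donor0", "doublet", "unassigned"), and the toy data are all hypothetical, and the paper may define its metrics differently.

```python
from collections import Counter


def demux_metrics(truth, predicted):
    """Return overall accuracy and per-label (precision, recall).

    truth, predicted: equal-length sequences with one label per barcode,
    e.g. a donor ID, "doublet", or "unassigned" (hypothetical labels).
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        raise ValueError("at least one barcode is required")

    truth_count = Counter(truth)      # barcodes truly carrying each label
    pred_count = Counter(predicted)   # barcodes assigned each label
    true_pos = Counter(t for t, p in zip(truth, predicted) if t == p)

    # Accuracy: fraction of barcodes whose predicted label matches the truth.
    accuracy = sum(true_pos.values()) / len(truth)

    per_label = {}
    for label in truth_count:
        # Precision: of the barcodes assigned this label, how many are correct.
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: of the barcodes truly carrying this label, how many were found.
        recall = true_pos[label] / truth_count[label]
        per_label[label] = (precision, recall)
    return accuracy, per_label


# Toy pool of six barcodes from two donors plus one doublet (made-up data).
truth = ["donor0", "donor0", "donor1", "donor1", "doublet", "donor1"]
predicted = ["donor0", "donor1", "donor1", "donor1", "doublet", "unassigned"]
accuracy, per_label = demux_metrics(truth, predicted)
print(f"accuracy = {accuracy:.3f}")
for label, (precision, recall) in sorted(per_label.items()):
    print(f"{label}: precision = {precision:.3f}, recall = {recall:.3f}")
```

Treating "doublet" as just another label keeps the scoring uniform across singlets and doublets. A benchmark could equally score doublet detection separately, so this choice is an assumption of the sketch.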

Country

Belgium

Related Organizations

KU Leuven (Belgium)
KU Leuven, Laboratory for Functional Epigenetics (Belgium)
University of Antwerp (Belgium)

Keywords

Biochemistry & Molecular Biology, 3101 Biochemistry and cell biology, Bioinformatics, 0601 Biochemistry and Cell Biology, Polymorphism, Single Nucleotide, Biochemical Research Methods, 3105 Genetics, Humans, Biology, 0802 Computation Theory and Mathematics, Computer. Automation, Cell Nucleus, Science & Technology, 0899 Other Information and Computing Sciences, Sequence Analysis, RNA, snRNA-Seq, Computational Biology, donor demultiplexing, Chemistry, Benchmarking, SEQ, genetic variation, 3102 Bioinformatics and computational biology, Problem Solving Protocol, Mathematical & Computational Biology, Single-Cell Analysis, Life Sciences & Biomedicine, Mathematics, Software

Impact by BIP!

- selected citations: 2. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Access routes: Green, hybrid

Fields of Science

engineering and technology

medical engineering