Single-cell type annotation with deep learning in 265 cell types for humans

Publication: Article, Other literature type. 01 Jan 2024. English. Publisher: Oxford University Press (OUP). Journal: Bioinformatics Advances, volume 4 (eISSN: 2635-0041)

Funded by: NIH | AIM-AHEAD Coordinating Ce...

Authors: Sherry Dong; Kaiwen Deng; Xiuzhen Huang;

doi: 10.1093/bioadv/vbae054

pmid: 38645719

pmc: PMC11031354

Abstract

Motivation: Annotating cell types is a challenging yet essential task in analyzing single-cell RNA sequencing data. However, because there is no gold standard, it is difficult to evaluate algorithms fairly, and an overfitting algorithm may be favored in benchmarks. To address this challenge, we developed a deep learning-based single-cell type prediction tool that assigns each cell to one of 265 human cell types, based on data from approximately five million cells.

Results: We achieved a median area under the ROC curve (AUC) of 0.93 when evaluated across datasets. We found that inconsistent labeling by different labs in the existing database contributed to the model's mistakes. We therefore used the cell ontology to correct the annotations and retrained the model, which raised the median AUC to 0.971. Our study reveals a factor that limits the accuracy achievable with the current database annotation and points toward algorithm-based correction of the gold standard for future automated cell annotation approaches.

Availability and implementation: The code is available at: https://github.com/SherrySDong/Hierarchical-Correction-Improves-Automated-Single-cell-Type-Annotation. Data used in this study are listed in Supplementary Table S1 and are retrievable at the CZI database.
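The evaluation idea in the abstract (a median per-cell-type AUC, and the effect of reconciling labels through an ontology) can be illustrated with a small sketch. This is not the authors' implementation: the ontology fragment, the labels, the scores, and the function names below are all hypothetical, and the "correction" shown is simply one plausible reading, namely treating a cell annotated with a descendant type as a positive for its ancestor types.

```python
from statistics import median

# Hypothetical toy ontology (child -> parent); not taken from the paper.
PARENT = {
    "CD4-positive T cell": "T cell",
    "CD8-positive T cell": "T cell",
    "T cell": "lymphocyte",
    "B cell": "lymphocyte",
}


def ancestors_and_self(label):
    """Return the label together with all of its ancestors in PARENT."""
    out = {label}
    while label in PARENT:
        label = PARENT[label]
        out.add(label)
    return out


def auc(scores, positives):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        return None  # AUC is undefined unless both classes are present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def median_auc(labels, scores_by_type, use_ontology):
    """Median one-vs-rest AUC over the cell types whose AUC is defined."""
    aucs = []
    for cell_type, scores in scores_by_type.items():
        if use_ontology:
            # A cell counts as positive if its label is the type or a descendant.
            positives = [cell_type in ancestors_and_self(l) for l in labels]
        else:
            # Exact string match only, as with unreconciled lab annotations.
            positives = [l == cell_type for l in labels]
        value = auc(scores, positives)
        if value is not None:
            aucs.append(value)
    return median(aucs)


# Hypothetical annotations from different labs at different granularities.
labels = ["CD4-positive T cell", "T cell", "B cell", "CD8-positive T cell"]
# Hypothetical model scores per cell, one list per predicted cell type.
scores_by_type = {
    "T cell": [0.9, 0.8, 0.1, 0.7],
    "B cell": [0.1, 0.2, 0.9, 0.3],
}

raw = median_auc(labels, scores_by_type, use_ontology=False)
corrected = median_auc(labels, scores_by_type, use_ontology=True)
print(raw, corrected)
```

In this toy case the model is penalized under exact matching for scoring CD4- and CD8-positive T cells highly as "T cell", and the penalty disappears once descendant labels are treated as positives. The numbers here are illustrative only and bear no relation to the 0.93 and 0.971 reported above.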

Related Organizations

University of Michigan–Flint
United States
University of Michigan–Ann Arbor
United States
Skyline High School
United States
Cedars-Sinai Medical Center
United States
University of Michigan
United States

Keywords

Original Article

Impact by BIP!

Selected citations: 2. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.