Downloads provided by UsageCounts
uTILity is a comprehensive, harmonized collection of publicly available single-cell RNA sequencing data from tumor-infiltrating T cells (TILs) with paired T cell receptor (TCR) sequencing. This resource aggregates data from 28 published studies spanning 13 tissue types, 420 unique patients, and over 2.6 million cells, of which 1.8 million have associated TCR information.

Data Processing

All datasets were uniformly processed using the following pipeline:

- Quality Control: Cells with >10% mitochondrial gene content and/or a feature count more than 2.5 standard deviations from the mean were excluded. Doublets were identified using scDblFinder.
- Annotation: Automated cell type annotation was performed using:
  - SingleR with the Human Primary Cell Atlas (HPCA) and Monaco reference datasets
  - Azimuth with the PBMC reference (providing L1, L2, and L3 annotations)
- TCR Integration: T cell receptor data was processed using scRepertoire, with clonotypes assigned based on CDR3 amino acid sequences and gene usage.

Contents

This archive contains:

- Seurat Objects (.rds): Fully processed R objects with gene expression, cell metadata, dimensional reductions, and TCR annotations
- AnnData Files (.h5ad): Python-compatible exports for use with scanpy, scvi-tools, and related ecosystems
- Processed Data: Intermediate files and per-cohort objects for users who wish to work with individual studies

Cancer Types Represented

Breast, Colorectal, Lung, Melanoma, Renal, Ovarian, HNSCC, Esophageal, Biliary, Endometrial, Merkel Cell, and multi-cancer cohorts.

Tissue Types

Tumor, Normal adjacent tissue, Peripheral blood, Lymph node, Metastatic lesions, and Juxtatumoral tissue.

Usage

This data is intended for researchers studying tumor immunology, T cell biology, and computational methods for single-cell analysis.
Users can leverage the harmonized annotations and TCR data for:

- Pan-cancer T cell phenotype analysis
- TCR repertoire studies across cancer types
- Benchmarking integration and annotation methods
- Training and validating machine learning models

For analysis code and the processing pipeline, see the associated GitHub repository.

File Formats

.h5ad (Hierarchical Data Format)

AnnData objects compatible with the Python single-cell ecosystem.

- X: Raw count matrix (sparse CSR)
- obs: Cell metadata
- var: Gene metadata
- obsm: Embeddings (PCA, UMAP, HARMONY, etc.)

Load in Python with:

    import scanpy as sc
    adata = sc.read_h5ad("adata.h5ad")

In R, readRDS() reads .rds files and cannot read .h5ad files, so load the corresponding Seurat object (.rds) instead (the file name below is a placeholder):

    library(Seurat)
    obj <- readRDS("seurat_object.rds")

Metadata Columns

See metadata_headers.txt in the GitHub repository for complete descriptions: https://github.com/ncborcherding/utility/blob/main/summary/metadata_headers.txt

Key columns:

- orig.ident: Sample identifier (tumor type + tissue)
- predicted.celltype.l1/l2/l3: Azimuth annotations
- Monaco.labels / HPCA.labels: SingleR annotations
- CTaa: Clonotype by CDR3 amino acid sequence
- clonalFrequency: Clone count within sample
- clonalProportion: Clone proportion within sample

SUGGESTED CITATION FORMAT

Borcherding, N. (2025). uTILity: Comprehensive Single-Cell Tumor-Infiltrating Lymphocyte Data with Paired TCR Sequencing (Version 1.0.0) [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.10211240
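As a rough illustration of how the clonotype columns relate to one another, the sketch below recomputes a per-sample clone count and clone proportion from orig.ident and CTaa in plain Python. The toy records and the clone_stats helper are hypothetical, and the sketch assumes that cells without TCR information are left out of the denominator; the values actually stored in clonalFrequency and clonalProportion come from scRepertoire and may be computed differently.

```python
from collections import Counter

# Toy stand-in for per-cell metadata. Sample names and CDR3 strings are
# invented for illustration and do not come from the dataset.
cells = [
    {"orig.ident": "S1", "CTaa": "CAVR_CASS"},
    {"orig.ident": "S1", "CTaa": "CAVR_CASS"},
    {"orig.ident": "S1", "CTaa": "CAAN_CSAR"},
    {"orig.ident": "S1", "CTaa": None},  # cell without TCR information
    {"orig.ident": "S2", "CTaa": "CAVR_CASS"},
    {"orig.ident": "S2", "CTaa": "CALG_CASR"},
]


def clone_stats(cells):
    """Return {(sample, clonotype): (count, proportion)} per sample.

    Assumption: cells with no clonotype are dropped before counting, so
    proportions are relative to TCR-bearing cells in the same sample.
    """
    with_tcr = [c for c in cells if c["CTaa"] is not None]
    # Number of cells sharing a clonotype within one sample.
    clone_counts = Counter((c["orig.ident"], c["CTaa"]) for c in with_tcr)
    # Number of TCR-bearing cells per sample (the denominator).
    sample_totals = Counter(c["orig.ident"] for c in with_tcr)
    return {
        key: (n, n / sample_totals[key[0]])
        for key, n in clone_counts.items()
    }


stats = clone_stats(cells)
for (sample, clonotype), (count, proportion) in sorted(stats.items()):
    print(f"{sample}\t{clonotype}\t{count}\t{proportion:.2f}")
```

Note that the same clonotype string appearing in two samples (here "CAVR_CASS" in S1 and S2) is counted separately per sample, which matches the "within sample" wording of the column descriptions.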
scTCR-seq, tumor-infiltrating lymphocytes, TCR sequencing, immune profiling, scRNA-seq, T cell receptor, cancer immunology, single-cell RNA sequencing, single-cell, T cells, TCR
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 148 |
| downloads | | 61 |

Views provided by UsageCounts