
Motivation: Deciphering gene regulatory networks (GRNs) from single-cell transcriptomics data remains a fundamental challenge in computational biology. It is hindered by data sparsity, high dimensionality, and the lack of scalable, generalizable inference models. To address this, we present GRNFormer, a generalizable graph transformer framework for accurate GRN inference from transcriptomics data across species, cell types, and platforms without requiring cell-type annotations or prior regulatory information.

Results: GRNFormer integrates a transformer-based gene expression encoder (Gene-Transcoder) with a variational graph autoencoder (GraViTAE) employing pairwise attention to jointly learn the representations of genes (nodes) and their co-expression relationships (edges). Leveraging TF-Walker, a transcription factor-anchored subgraph sampling strategy, it effectively captures gene regulatory interactions from either single-cell or bulk RNA-seq datasets. Benchmarking on standard datasets demonstrates that GRNFormer outperforms existing traditional and deep learning state-of-the-art methods in blind evaluations, achieving average Sampled Area Under the Receiver Operating Characteristic Curve (Sampled_AUROC) and Sampled Area Under the Precision-Recall Curve (Sampled_AUPRC) values between 0.90 and 0.98 as well as 0.87-0.98 average Sampled F1 score. The model robustly recovers both known and novel regulatory networks, including pluripotency circuits in human embryonic stem cells (hESCs) and immune cell modules in Peripheral Blood Mononuclear Cells (PBMCs). The architecture enables scalable, biologically interpretable GRN inference across various datasets, cell types, and species, establishing GRNFormer as a robust and transferable tool for network biology.
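As an illustration of what "transcription factor-anchored subgraph sampling" can mean in general, the following is a minimal Python sketch. It is not the TF-Walker implementation, whose details are not given in this abstract: the breadth-first expansion, the function name `tf_anchored_subgraph`, the `max_nodes` parameter, and the toy co-expression graph are all assumptions made for illustration only.

```python
import random
from collections import deque


def tf_anchored_subgraph(adjacency, tf, max_nodes, rng):
    """Hypothetical helper: sample up to max_nodes genes by breadth-first
    expansion from one transcription factor (the anchor).

    adjacency maps each gene to the set of genes it is co-expressed with.
    Returns the sampled genes (anchor first) and the undirected edges
    among them.
    """
    visited = [tf]
    seen = {tf}
    queue = deque([tf])
    while queue and len(visited) < max_nodes:
        gene = queue.popleft()
        # Sort first so the shuffle is reproducible for a given seed.
        neighbors = sorted(adjacency.get(gene, ()))
        rng.shuffle(neighbors)
        for neighbor in neighbors:
            if neighbor in seen:
                continue
            seen.add(neighbor)
            visited.append(neighbor)
            queue.append(neighbor)
            if len(visited) == max_nodes:
                break
    # Keep only edges whose two endpoints were both sampled.
    edges = sorted(
        {
            tuple(sorted((u, v)))
            for u in visited
            for v in adjacency.get(u, ())
            if v in seen
        }
    )
    return visited, edges


# Toy co-expression graph with made-up gene names (assumption, not real data).
toy_graph = {
    "TF1": {"G1", "G2"},
    "G1": {"TF1", "G3"},
    "G2": {"TF1"},
    "G3": {"G1"},
    "G4": {"G5"},
    "G5": {"G4"},
}

nodes, edges = tf_anchored_subgraph(toy_graph, "TF1", 3, random.Random(0))
all_nodes, _ = tf_anchored_subgraph(toy_graph, "TF1", 10, random.Random(0))
```

In this sketch every subgraph contains its anchor TF, and genes unreachable from the anchor (here `G4` and `G5`) are never sampled, which is the property that makes the sampling "anchored".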
GRNFormer v1.0.2-manuscript-archive: Archived Version Used in Manuscript

Summary
This tag archives the exact version of the codebase used to generate the results reported in the manuscript.

Purpose
- Ensures reproducibility of the results described in the manuscript
- Preserves the precise code state used for experiments and analysis
- Allows readers and reviewers to access the same implementation used in the study
- No changes to core functions or features (fully backward compatible)

Notes
This version corresponds to the code used during manuscript result generation. No further modifications should be made to this archived tag.
Machine Learning, Gene Expression Regulation, Computational Biology, Gene Regulatory Networks
