
Overview

Using public 10x Xenium spatial transcriptomic data of breast cancer, we exploit the cell-stacking phenomenon along the Z-axis, traditionally treated as a technical artifact, as a genuine physical signal. We develop a grid-based, multi-scale texture analysis framework built on Z-axis stratification statistics. The pipeline proceeds as follows.

After quality control, transcripts are binned into spatial grids. A robust baseline correction (RANSAC + Huber regression, with automatic linear/quadratic model selection via AIC, validated by Moran's I on the residuals) removes global geometric trends from the Z-axis.

At multiple Gaussian kernel scales (σ = 15, 30, 45 μm), three continuous physical fields are computed per grid: transcript molecule density (ρ), overall Z-dispersion (z_std_all), and the imbalance-enhanced upper–lower Z-dispersion difference (z_std_diff_enhanced). Edge correction and confidence weighting are applied throughout.

Using only these geometry-derived features, with no pathology region annotations, cell-type labels, or gene expression input, we perform unsupervised classification via a diagonal-covariance GMM with Potts-model MRF spatial smoothing. The number of clusters (K) is selected by bootstrap stability plus ICL, and the smoothing strength (λ) is chosen by a stability–boundary-ratio objective over a σ × λ sensitivity grid. A leakage guard formally verifies that no biological or expression features enter the classification stage.

Post-classification biological validation is conducted entirely downstream. Grid-level count matrices and CPM are constructed; per-cluster marker ranking (vectorized one-vs-rest Wilcoxon) and pairwise differential gene expression (Mann–Whitney U with BH correction) are performed; and pathway enrichment (MSigDB Hallmark, GO BP, KEGG) is run via gseapy. Marker-group scoring (log1p of the mean CPM over curated gene panels) with Cohen's d effect sizes quantifies functional differences between clusters.
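The marker-group score and its effect size can be illustrated with a minimal sketch. It assumes the score for one grid is log1p of the mean CPM over the panel genes and that Cohen's d uses the pooled sample standard deviation. The function names, the placeholder gene names, and the rule that a gene missing from a grid counts as 0 are assumptions made for illustration, not details taken from the pipeline.

```python
import math
from statistics import mean, variance


def panel_score(cpm_by_gene, panel):
    """Return log1p(mean CPM) over the panel genes for a single grid."""
    # Assumption: a panel gene absent from this grid contributes a CPM of 0.
    values = [cpm_by_gene.get(gene, 0.0) for gene in panel]
    return math.log1p(mean(values))


def cohens_d(a, b):
    """Cohen's d between two groups of scores, using the pooled sample SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Toy usage with made-up CPM values and placeholder gene names.
panel = ["GENE_A", "GENE_B"]
cluster_1 = [{"GENE_A": 120.0, "GENE_B": 80.0}, {"GENE_A": 90.0, "GENE_B": 60.0},
             {"GENE_A": 150.0, "GENE_B": 110.0}]
cluster_2 = [{"GENE_A": 10.0, "GENE_B": 5.0}, {"GENE_A": 20.0},
             {"GENE_A": 15.0, "GENE_B": 12.0}]
scores_1 = [panel_score(grid, panel) for grid in cluster_1]
scores_2 = [panel_score(grid, panel) for grid in cluster_2]
effect = cohens_d(scores_1, scores_2)
```

A positive `effect` means the panel scores higher in the first cluster; the sign flips if the two groups are swapped.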
Spatial interface analysis computes signed distances to the cluster boundary, constructs interface gradient heatmaps (z-scored feature profiles binned by distance), and derives interface strength and sharpness metrics (contrast Cohen's d, near-boundary slope, maximum gradient, AUC separation). A radius sensitivity sweep with partial Spearman correlations (controlling for transcript density) confirms that the Z-dispersion–density and Z-dispersion–heterogeneity associations are robust across neighborhood scales. DGE restricted to the curated gene panels, with its own pathway enrichment, provides a focused validation on biologically curated gene sets.
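A partial Spearman correlation of the kind used in the radius sensitivity sweep can be sketched as follows. The sketch assumes the standard first-order partial correlation formula applied to average ranks, with a single control variable (transcript density). The function names and toy inputs are hypothetical, and it is not known whether the pipeline computes the statistic this way.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after controlling for z."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy usage: x = Z-dispersion, y = a heterogeneity measure, z = transcript density.
z_dispersion = [1.0, 2.0, 3.0, 4.0, 5.0]
heterogeneity = [1.0, 3.0, 2.0, 5.0, 4.0]
density = [2.0, 1.0, 4.0, 3.0, 5.0]
rho_partial = partial_spearman(z_dispersion, heterogeneity, density)
```

Because the statistic depends only on ranks, any monotone increasing transformation of an input leaves it unchanged. In a sweep, the same call would be repeated once per neighborhood radius.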
Markov random field, spatial transcriptomics, interface analysis, cell stacking, pathway enrichment, KDE, Xenium, spatial autocorrelation, unsupervised clustering, label-free segmentation, Z-axis stratification, Gaussian mixture model, Tumor Microenvironment, Potts model, grid-based texture analysis, DGE, signed distance
