doi: 10.5281/zenodo.20574476
SMAT adds a learned semantic similarity matrix and a centrality-derived value gate to transformer attention. The gate is load-bearing: removing it collapses performance. Semantic information acts through gating, not through direct attention biasing.
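To make the distinction between gating and attention biasing concrete, here is a minimal single-head sketch in NumPy. It is not the SMAT implementation. The summary above does not give the formulas, so everything specific here is an assumption made for illustration: the similarity matrix is built from a hypothetical learned projection `w_sim`, centrality is taken as the row mean of that matrix, the gate is a sigmoid of the centrality, and the names `centrality_gate` and `gated_attention` are invented. The one property taken from the text is that the similarity matrix reaches the output only by scaling the values and is never added to the attention logits.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def centrality_gate(sim):
    """Map a token-by-token similarity matrix to one gate value per token.

    Assumption: centrality is the row mean of `sim` (a degree-style measure),
    squashed by a sigmoid so every gate value lies strictly between 0 and 1.
    """
    centrality = sim.mean(axis=-1)
    return 1.0 / (1.0 + np.exp(-centrality))


def gated_attention(q, k, v, sim, use_gate=True):
    """Scaled dot-product attention with an optional centrality gate on the values."""
    d = q.shape[-1]
    # Plain attention logits: `sim` is deliberately NOT added here as a bias.
    weights = softmax(q @ k.T / np.sqrt(d))
    if use_gate:
        # Semantic information enters only by rescaling each token's value vector.
        v = v * centrality_gate(sim)[:, None]
    return weights @ v


rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8
x = rng.standard_normal((n_tokens, d_model))

# Hypothetical stand-in for learned parameters: project tokens, then take
# inner products to obtain a symmetric similarity matrix.
w_sim = 0.1 * rng.standard_normal((d_model, d_model))
proj = x @ w_sim
sim = proj @ proj.T

out_gated = gated_attention(x, x, x, sim)
out_ablated = gated_attention(x, x, x, sim, use_gate=False)
```

Calling the function with `use_gate=False` corresponds to the ablation described above. In this sketch the attention weights are identical in both calls, so any difference between `out_gated` and `out_ablated` comes from the value path alone.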