
This paper investigates the phenomenon of grokking in transformers across a broader class of algebraic structures beyond modular addition. Prior mechanistic interpretability work has shown that transformers trained on modular addition learn Fourier-based clock circuits and exhibit delayed generalisation (grokking). We extend this analysis to eight algebraic operations spanning abelian groups, a composite ring, and non-abelian groups (S3, D5, A4, S4), using 1-layer transformers at d_model = 64.

Our key findings are:

1. A clear abelian vs non-abelian grokking boundary: all abelian operations achieve 100% test accuracy, while non-abelian groups fail to generalise despite perfect training accuracy.
2. Discrete-log re-indexing improves Fourier concentration for modular multiplication (2.14×), supporting the discrete logarithm representation hypothesis.
3. Non-abelian models exhibit partial circuit formation via Peter–Weyl decomposition even without grokking.
4. Cross-operation embedding similarity (CKA ≥ 0.80 across all pairs) suggests a shared representational substrate.
5. A capacity-dependent interpretation: abelian tasks rely on 1D irreducible representations, while non-abelian tasks require higher-dimensional irreps exceeding model capacity at d_model = 64.

All experiments are reproducible via provided code and checkpoint-resume pipelines, runnable on a free Colab T4 GPU (~3 hours). This work contributes new empirical evidence toward understanding the role of algebraic structure and representation theory in neural network generalisation.

Code repository: https://github.com/justbytecode/grokking-beyond-addition
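The discrete-log re-indexing in finding 2 rests on a standard fact: for a prime p, the nonzero residues form a cyclic group, so relabelling each residue by its exponent with respect to a generator turns multiplication mod p into addition mod p - 1. The sketch below only illustrates that fact and is not taken from the repository; the modulus `p = 7` and the helper names `primitive_root` and `discrete_log_table` are assumptions chosen for illustration.

```python
def primitive_root(p):
    """Smallest generator of the multiplicative group mod an odd prime p (brute force)."""
    for g in range(2, p):
        seen = set()
        x = 1
        for _ in range(p - 1):
            x = (x * g) % p
            seen.add(x)
        # g is a generator exactly when its powers hit every nonzero residue.
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def discrete_log_table(p, g):
    """Map each nonzero residue a to the exponent k with g**k == a (mod p)."""
    table = {}
    x = 1
    for k in range(p - 1):
        table[x] = k
        x = (x * g) % p
    return table


p = 7  # illustrative modulus (assumption; the abstract does not state one)
g = primitive_root(p)
dlog = discrete_log_table(p, g)

# After re-indexing, multiplication mod p becomes addition mod (p - 1).
for a in range(1, p):
    for b in range(1, p):
        assert dlog[(a * b) % p] == (dlog[a] + dlog[b]) % (p - 1)
```

Zero has no discrete logarithm, so it is excluded from the re-indexed table. Under this relabelling, a modular multiplication task has the same structure as modular addition, which is plausibly why a Fourier analysis over the re-indexed tokens would look more concentrated.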
mechanistic interpretability, representation learning, grokking, group theory, non-abelian groups, transformers, deep learning theory, Fourier analysis
