ZENODO
Preprint, 2025
License: CC BY
Data sources: Datacite

Geometric Phase Transitions in Transformers: The Instant Transformation from Euclidean to Hyperbolic Space

Authors: Gardner, James


Abstract

NOTE: This work is not peer reviewed. Please provide any feedback via an issue in the GitHub repository linked below, or by contacting me via email.

We reveal that transformer networks undergo discrete geometric phase transitions, instantly converting Euclidean word embeddings into hyperbolic representations through self-attention. Layer-by-layer analysis shows embeddings start with 0% reverse triangle inequality violations (perfect Euclidean geometry) but exhibit 100% violations after the first attention layer, indicating complete hyperbolic structure.

Key Findings:
• Instant phase transition: Layer 0 shows δ/⟨dist⟩ = 0.0000 (Euclidean), Layer 1 shows δ/⟨dist⟩ = 0.043 (hyperbolic), a 4× jump with a 95% confidence interval of [0.0398, 0.0468].
• Boundary avoidance mechanisms: Softmax prevents infinite distances, LayerNorm constrains radii (constant at 2.647 ± 0.0004 across all layers), and residual connections limit drift.
• Architectural explanation: This geometric transition explains both transformer capabilities (hierarchical reasoning is native to hyperbolic space) and limitations (counting/iteration failures due to constrained helical navigation paths).

Methodology: Rigorous statistical analysis with 1000 bootstrap iterations, proper hyperbolic metrics using Poincaré ball distances, and validation across multiple models (BERT, GPT-2, MiniLM). Robustness was confirmed with and without normalization.

This work provides the first systematic measurement of geometric structure in transformer layers, revealing fundamental architectural constraints that govern both the power and limitations of these models. Complete reproducible code and data are included. Research conducted with AI assistance. All results independently verified.
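The methodology described above mentions a δ/⟨dist⟩ score, Poincaré ball distances, and 1000 bootstrap iterations. The sketch below is an illustrative reconstruction of that kind of measurement, not the paper's released code: it computes pairwise distances for a set of embedding vectors (Euclidean by default, with a Poincaré-ball variant), estimates the four-point Gromov δ-hyperbolicity, normalises it by the mean pairwise distance, and bootstraps a 95% confidence interval. Function names, sample sizes, and the choice of basepoint-0 δ are assumptions for illustration.

```python
# Minimal sketch (assumed workflow, not the author's released code): estimate a
# relative hyperbolicity score delta / <dist> for one layer's token embeddings
# and bootstrap a confidence interval. Only the standard Poincare-ball distance
# and four-point Gromov delta formulas are used.
import numpy as np

def poincare_distances(X, eps=1e-9):
    """Pairwise geodesic distances in the Poincare ball for points X (n, d).
    Assumes every row has norm < 1 (rescale beforehand if necessary)."""
    sq_norms = np.sum(X * X, axis=1)
    sq_diff = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    denom = (1.0 - sq_norms)[:, None] * (1.0 - sq_norms)[None, :] + eps
    return np.arccosh(np.maximum(1.0 + 2.0 * sq_diff / denom, 1.0))

def euclidean_distances(X):
    """Plain pairwise Euclidean distances for points X (n, d)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt(np.sum(diff * diff, axis=2))

def gromov_delta(D):
    """Four-point Gromov delta of a finite metric space with distance matrix D,
    using basepoint 0 and a (max, min) matrix product."""
    gp = 0.5 * (D[0][:, None] + D[0][None, :] - D)   # Gromov products (x|y)_w
    # maxmin[x, y] = max_z min(gp[x, z], gp[z, y])
    maxmin = np.max(np.minimum(gp[:, None, :], gp.T[None, :, :]), axis=2)
    return float(np.max(maxmin - gp))

def relative_delta(D):
    """delta / <dist>: hyperbolicity normalised by the mean pairwise distance."""
    n = D.shape[0]
    mean_dist = D[np.triu_indices(n, k=1)].mean()
    return gromov_delta(D) / mean_dist

def bootstrap_relative_delta(X, n_boot=1000, metric=euclidean_distances, seed=0):
    """Bootstrap a 95% CI for delta / <dist> by resampling embedding rows."""
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.choice(len(X), size=len(X), replace=True)
        stats.append(relative_delta(metric(X[idx])))
    lo, hi = np.percentile(stats, [2.5, 97.5])
    return relative_delta(metric(X)), (lo, hi)

if __name__ == "__main__":
    # Stand-in for one layer's token embeddings (e.g. extracted from BERT);
    # fewer bootstrap iterations than the paper's 1000, purely for speed.
    X = np.random.default_rng(0).normal(size=(64, 32))
    score, (lo, hi) = bootstrap_relative_delta(X, n_boot=200)
    print(f"delta/<dist> = {score:.4f}, 95% CI [{lo:.4f}, {hi:.4f}]")
```

The same `relative_delta` call can be applied per layer (hidden states from layer 0, layer 1, ...) to reproduce a layer-by-layer curve of the kind the abstract describes; swapping `euclidean_distances` for `poincare_distances` after projecting the embeddings into the unit ball gives the hyperbolic-metric variant.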

Keywords

transformer networks, attention mechanisms, geometric phase transitions, neural architecture analysis, hyperbolic geometry

  • BIP! impact indicators: citations 0 · popularity Average · influence Average · impulse Average