Abstract

NOTE: This work is not peer reviewed. Please provide feedback by opening an issue in the GitHub repository linked below, or by contacting me via email.

We reveal that transformer networks undergo discrete geometric phase transitions, instantly converting Euclidean word embeddings into hyperbolic representations through self-attention. Layer-by-layer analysis shows that embeddings start with 0% reverse triangle inequality violations (perfect Euclidean geometry), but after the first attention layer exhibit 100% violations, indicating complete hyperbolic structure.

Key Findings:

- Instant phase transition: Layer 0 shows δ/⟨dist⟩ = 0.0000 (Euclidean); Layer 1 shows δ/⟨dist⟩ = 0.043 (hyperbolic), a 4× jump with a 95% confidence interval of [0.0398, 0.0468].
- Boundary avoidance mechanisms: softmax prevents infinite distances, LayerNorm constrains radii (a constant 2.647 ± 0.0004 across all layers), and residual connections limit drift.
- Architectural explanation: this geometric transition explains both transformer capabilities (hierarchical reasoning is native to hyperbolic space) and limitations (counting/iteration failures due to constrained helical navigation paths).

Methodology: rigorous statistical analysis with 1000 bootstrap iterations, proper hyperbolic metrics using Poincaré-ball distances, and validation across multiple models (BERT, GPT-2, MiniLM). Robustness was confirmed with and without normalization. Illustrative sketches of the core measurements appear below.

This work provides the first systematic measurement of geometric structure in transformer layers, revealing fundamental architectural constraints that govern both the power and the limitations of these models. Complete reproducible code and data are included. Research was conducted with AI assistance; all results were independently verified.
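For concreteness, here is a minimal sketch of the two hyperbolic measurements the abstract names: the Poincaré-ball distance and the normalized Gromov delta δ/⟨dist⟩. This is not the paper's released code; the four-point estimator, the quadruple sampling budget, and the assumption that embeddings have already been projected strictly inside the unit ball are all mine.

```python
# Sketch of the measurements named in the abstract, under the assumptions above.
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    # Standard Poincare-ball metric: arccosh(1 + 2*||u-v||^2 / ((1-||u||^2)(1-||v||^2)))
    return np.arccosh(1.0 + 2.0 * sq / max(denom, eps))

def normalized_gromov_delta(dist, n_quadruples=10_000, seed=0):
    """Estimate delta / <dist> from a symmetric pairwise distance matrix.

    delta comes from the four-point condition: for each sampled quadruple,
    half the gap between the two largest of the three pairwise-sum terms.
    """
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    delta = 0.0
    for _ in range(n_quadruples):
        x, y, z, w = rng.choice(n, size=4, replace=False)
        sums = sorted((dist[x, y] + dist[z, w],
                       dist[x, z] + dist[y, w],
                       dist[x, w] + dist[y, z]))
        delta = max(delta, (sums[2] - sums[1]) / 2.0)
    # Normalize by the mean pairwise distance (upper triangle only).
    mean_dist = dist[np.triu_indices(n, k=1)].mean()
    return delta / mean_dist
```

In this sketch, a layer's hidden states would be rescaled to norms below 1 before building the distance matrix with `poincare_distance`; that projection step is also an assumption, since the abstract does not specify how embeddings are mapped into the ball.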
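The 1000-iteration bootstrap behind the reported confidence interval can be sketched the same way. Treating per-sample δ/⟨dist⟩ estimates as the resampling unit and using a percentile interval are my assumptions, not details from the paper.

```python
# Hedged sketch of a percentile bootstrap, assuming per-sample delta estimates.
import numpy as np

def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement n_boot times and record each resample's mean.
    means = np.array([rng.choice(values, size=values.size, replace=True).mean()
                      for _ in range(n_boot)])
    lo, hi = np.quantile(means, [alpha / 2.0, 1.0 - alpha / 2.0])
    return values.mean(), lo, hi

# Usage: mean, lo, hi = bootstrap_ci(per_sample_deltas)  # 95% CI by default
```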
Keywords: transformer networks, attention mechanisms, geometric phase transitions, neural architecture analysis, hyperbolic geometry