ZENODO
Other literature type . 2025
License: CC BY
Data sources: Datacite
Evans' Law 5.0: Long-Context Degradation in Multimodal Models and the Cross-Modal Degradation Tax 5.0

Authors: Evans, Jennifer


Abstract

Evans’ Law v5.0 presents a comprehensive empirical investigation into long-context degradation in large language models (LLMs), introducing two validated scaling laws that quantify coherence collapse in both text-only and multimodal architectures. Through controlled cross-model testing across GPT-5.0/5.1, Claude 4.5, Gemini Pro/Flash, Grok 4.1, DeepSeek v3.2, Qwen, and Mixtral, the study identifies the relationship between model scale and effective coherent context, showing that usable context windows are far shorter than vendor-advertised capacities. The paper establishes that text-only coherence follows a sublinear law of the form: L_{\text{text}} \approx 1969.8 \times M^{0.74}, while multimodal coherence follows: L_{\text{multi}} \approx 582.5 \times M^{0.64}. The gap between these exponents demonstrates a robust, architecture-independent Cross-Modal Degradation Tax, typically reducing coherent reasoning depth by 60–80% when vision-language fusion is active. The paper also introduces a drift taxonomy (compression drift, expansion drift, logic-deferral drift, and code-layer leakage) that captures vendor-specific signatures preceding full collapse. A key finding is the architectural regression in GPT-5.0, which collapses at ~35–47K tokens despite a predicted threshold above 110K, and transitions into an “opaque coherence” failure mode where outputs remain fluent but semantically incorrect and difficult to detect. These opaque failures pose significant safety, reliability, and compliance risks for enterprise users, especially when long-context decomposition bypasses single-instruction safety filters. The work provides actionable implications for engineers, enterprises, and policymakers, arguing for validated coherence-limit reporting (e.g., “1M token window; coherence validated to 75K tokens”) as a necessary transparency norm. 
Evans’ Law v5.0 advances a reproducible empirical foundation for evaluating long-context reliability and motivates the development of a broader framework for Artificial Conversational Phenomenology.
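The two scaling laws quoted in the abstract can be sketched directly. This is a minimal illustration, not the paper's code; the units of the scale variable M are an assumption here (the abstract excerpt does not define them), and the constants are taken verbatim from the formulas above.

```python
def coherent_context(m_scale: float, multimodal: bool = False) -> float:
    """Estimate coherent context length (tokens) from model scale M.

    Constants come from the abstract's fitted laws:
      L_text  ~ 1969.8 * M^0.74
      L_multi ~  582.5 * M^0.64
    The units of M (e.g. billions of parameters) are an assumption.
    """
    if multimodal:
        return 582.5 * m_scale ** 0.64
    return 1969.8 * m_scale ** 0.74


def cross_modal_tax(m_scale: float) -> float:
    """Fractional loss of coherent depth when vision-language fusion is active."""
    return 1.0 - coherent_context(m_scale, multimodal=True) / coherent_context(m_scale)
```

At M = 10, for instance, the text-only law gives roughly 10.8K coherent tokens versus about 2.5K multimodal, a tax of about 77% — inside the 60–80% band the abstract reports.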

Keywords

AI Scaling, AI Governance, Long-Context Performance, LLM Coherence, Multimodal AI, AI Conversation Methodology, Drift signatures, Evans Law, AI Conversational Phenomenology
