ZENODO
Other literature type . 2025
License: CC BY
Data sources: ZENODO, Datacite

Does Agentic AI Exist? A Cross-Vendor Investigation (Evans Law v6.0)

Authors: Evans, Jennifer

Abstract

The AI industry has rapidly adopted the term "agentic AI" to describe large language models that can autonomously execute multi-step tasks, maintain state across extended interactions, and operate with minimal human supervision. This paper presents evidence that autonomous agentic capability, as marketed, does not currently exist in any major commercial LLM system. Through systematic cross-vendor testing of five leading AI platforms (OpenAI's GPT, Anthropic's Claude, xAI's Grok, DeepSeek, and Google's Gemini), we demonstrate that what is marketed as "agentic AI" is either (1) sophisticated tool use within supervised conversational sessions, or (2) traditional software systems with LLM components, where the engineering infrastructure, not the AI, provides the agentic behavior. We argue that "agentic AI" represents a category error: conflating LLM capability with complete system capability.
We introduce three theoretical tools: (1) Evans' Law for Agentic AI (Cₐ = L × S(t,e) × U(v)), which predicts collapse at three discrete operational thresholds: Launch (83% failure rate), Sustain (memory-free drift), and Upgrade (100% failure rate); (2) the Evans Ratio (E = Cp/Cd), which quantifies the balance between probabilistic intelligence and deterministic control; and (3) the Brock Threshold (E = 1), which defines the boundary between sophisticated automation and true agency.
Analysis of production deployments, including Booking.com's customer service agent, Klarna's support automation, and Salesforce's Agentforce, reveals that successful "agentic" systems achieve reliability through extensive code-based constraints that limit LLM operation to narrow, carefully bounded tasks. These systems consistently measure E < 0.3 on the Evans Ratio, indicating that deterministic scaffolding contributes 3-10x more to autonomous behavior than the probabilistic core. All of them operate far below the Brock Threshold, the boundary where true agency would emerge.
We conclude with a call for evidence-based terminology and propose evaluating LLMs as components within engineered systems rather than as autonomous agents. The question is not whether LLMs will become truly autonomous agents in the future, but whether we can be precise about what they are today.
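
The arithmetic behind the Evans Ratio and the Brock Threshold can be illustrated with a minimal sketch. The abstract gives only the definitions E = Cp/Cd and E = 1; it does not say how Cp and Cd are measured. The code below therefore assumes they are non-negative scalars in the same arbitrary units, and the names `SystemProfile`, `evans_ratio`, `scaffolding_multiple`, and `below_brock_threshold`, as well as the example values, are hypothetical.

```python
from dataclasses import dataclass

# Brock Threshold as stated in the abstract: E = 1.
BROCK_THRESHOLD = 1.0


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical container; the measurement of Cp and Cd is an assumption."""
    name: str
    cp: float  # contribution of the probabilistic (LLM) core, arbitrary units
    cd: float  # contribution of the deterministic scaffolding, same units


def evans_ratio(cp: float, cd: float) -> float:
    """Return E = Cp / Cd, the ratio defined in the abstract."""
    if cp < 0 or cd <= 0:
        raise ValueError("cp must be >= 0 and cd must be > 0")
    return cp / cd


def scaffolding_multiple(e: float) -> float:
    """Return Cd / Cp = 1 / E, i.e. how many times more the scaffolding contributes."""
    if e <= 0:
        raise ValueError("E must be positive")
    return 1.0 / e


def below_brock_threshold(e: float) -> bool:
    """True when E is strictly below the Brock Threshold (E = 1)."""
    return e < BROCK_THRESHOLD


# Illustrative values only; these are not measurements from the paper.
example = SystemProfile(name="hypothetical support bot", cp=2.0, cd=10.0)
e = evans_ratio(example.cp, example.cd)
print(
    f"{example.name}: E = {e:.2f}, "
    f"scaffolding multiple = {scaffolding_multiple(e):.1f}x, "
    f"below Brock Threshold: {below_brock_threshold(e)}"
)
```

Under these assumptions, the "3-10x" figure in the abstract corresponds to E between roughly 0.33 and 0.1, since the scaffolding multiple is simply 1/E.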

Keywords

Coherence Collapse, Long-Context Degradation, Transformer capabilities, Corporate AI, AI safety, Transformers, Agentic AI, Brock Threshold, LLMs, Evans Ratio, Evans Law, AI agents, AI policy
