Powered by OpenAIRE graph
Found an issue? Give us feedback
ZENODOarrow_drop_down
ZENODO
Preprint . 2026
License: CC BY
Data sources: Datacite
ZENODO
Preprint . 2026
License: CC BY
Data sources: Datacite
versions View all 2 versions
addClaim

HNIR-CCP: Empirical Evaluation of Deterministic Control Planes for AI Agent Governance

Authors: Ravi, Aravind;

HNIR-CCP: Empirical Evaluation of Deterministic Control Planes for AI Agent Governance

Abstract

This work presents an empirical evaluation of deterministic control-plane architecture for AI agent systems, building on the previously proposed HNIR (Hybrid Neuro-Symbolic Intent Router) framework. Modern AI agents often rely on large language models (LLMs) for both reasoning and system-level control, including navigation, policy enforcement, and safety-critical operations. While LLMs perform well in open-ended tasks, their probabilistic nature introduces inconsistency in contexts requiring deterministic behavior. This study evaluates governance performance across 100 structured scenarios spanning adversarial, control, policy, and state-based interactions. Multiple frontier models are tested, including GPT-4o, o3, Claude Sonnet, Claude Opus, and Gemini 2.5 Pro, alongside guardrail frameworks such as NeMo Guardrails and Guardrails AI. All evaluations are conducted with temperature set to zero to ensure deterministic decoding and isolate model behavior from sampling variability. Results indicate that, under the tested conditions, no LLM-based system achieves full governance compliance, even with explicit policy prompting. The best-performing model (Claude Opus) achieves 91% compliance, while the deterministic control-plane implementation achieves 100% compliance. In addition to compliance, deterministic routing demonstrates significant efficiency gains, achieving microsecond-level latency (~40.6 μs) compared to millisecond-scale latency in LLM systems, and eliminating inference cost. The findings suggest that certain governance functions in AI agent systems may be better handled through deterministic enforcement layers rather than relying solely on probabilistic reasoning. A reference implementation of the deterministic control-plane architecture is available at:https://github.com/Teknamin/hnir-ccp This work extends the HNIR architecture by providing empirical evidence of its practical implications in governance scenarios.

Keywords

AI architecture, evaluation benchmark, deterministic control plane, policy enforcement, AI agent governance, neuro-symbolic systems, LLM safety

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
0
Average
Average
Average
Upload OA version
Are you the author of this publication? Upload your Open Access version to Zenodo!
It’s fast and easy, just two clicks!