Powered by OpenAIRE graph
Preprint
Data sources: ZENODO

Token Budget Contracts for Multi-Agent Large Language Model Orchestration with Confidence-Gated Retrieval

Authors: Borgaonkar, Swaranshu;

Abstract

Multi-agent large language model (LLM) systems consume tokens at four to fifteen times the rate of single-agent calls, with no formal mechanism to bound per-agent consumption or enforce minimum response quality. This paper introduces Token Budget Contracts (TBC), a declarative resource protocol for multi-agent LLM orchestration comprising three patentable elements: (1) a formal contract schema TBCᵢ = (Imax, Omax, Cmin, Priority, Dependencies) declared per agent; (2) a priority-weighted dynamic reallocation algorithm δ = min(R, α × Pw × deficit × Imax) that transfers unused budget to agents exhibiting confidence deficits; and (3) an Adaptive Confidence-Gated Retrieval (CGR) mechanism that calibrates per-agent retrieval thresholds using the objective θᵣ = argmax[Accuracy(C≥θ) − λ × RetrievalCost(C<θ)] over a rolling measurement window. A fully functional Python implementation with 72 passing tests demonstrates 40–60% token reduction at 97%+ accuracy preservation and approximately 46% hallucination reduction relative to unconstrained baseline operation. Patent Pending — US Provisional Application No. 64/081,925, filed June 3, 2026.
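To make the three elements concrete, the following is a minimal Python sketch, not the author's implementation. The abstract does not define its symbols, so their readings here are assumptions: Imax and Omax are taken as input and output token ceilings, Cmin as a minimum confidence in [0, 1], Pw as the agent's priority weight, R as the unused budget available for transfer, α as a scaling coefficient, and deficit as max(0, Cmin − observed confidence). Accuracy(C≥θ) is read as the fraction of correct answers among samples at or above the threshold, and RetrievalCost(C<θ) as the fraction of samples that fall below it. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TokenBudgetContract:
    """Per-agent contract mirroring (Imax, Omax, Cmin, Priority, Dependencies)."""
    i_max: int                 # assumed meaning: input token ceiling
    o_max: int                 # assumed meaning: output token ceiling
    c_min: float               # assumed meaning: minimum confidence in [0, 1]
    priority: float            # assumed meaning: priority weight Pw
    dependencies: Tuple[str, ...] = ()


def reallocation_delta(contract: TokenBudgetContract, confidence: float,
                       reserve: float, alpha: float = 0.5) -> float:
    """delta = min(R, alpha * Pw * deficit * Imax), with deficit clamped at zero."""
    deficit = max(0.0, contract.c_min - confidence)
    return min(reserve, alpha * contract.priority * deficit * contract.i_max)


def calibrate_threshold(window: Sequence[Tuple[float, bool]], lam: float,
                        candidates: Sequence[float]) -> float:
    """Pick the candidate theta maximizing Accuracy(C>=theta) - lam * RetrievalCost(C<theta).

    `window` holds (confidence, was_correct) pairs from the rolling window.
    """
    if not window:
        raise ValueError("window must contain at least one measurement")

    def objective(theta: float) -> float:
        kept = [ok for conf, ok in window if conf >= theta]
        accuracy = sum(kept) / len(kept) if kept else 0.0
        retrieval_cost = sum(1 for conf, _ in window if conf < theta) / len(window)
        return accuracy - lam * retrieval_cost

    return max(candidates, key=objective)


if __name__ == "__main__":
    contract = TokenBudgetContract(i_max=1000, o_max=500, c_min=0.8, priority=1.0)
    # Agent reports confidence 0.6 against Cmin 0.8; 500 tokens are unused elsewhere.
    print(reallocation_delta(contract, confidence=0.6, reserve=500))
    history = [(0.9, True), (0.8, True), (0.4, False), (0.3, False)]
    print(calibrate_threshold(history, lam=0.1, candidates=[0.0, 0.5, 0.95]))
```

Under these assumptions the transfer is capped by the reserve R, so a large confidence deficit can never draw more budget than is actually unused, and an agent already at or above Cmin receives nothing. The grid search over candidate thresholds is one simple way to evaluate the argmax; the abstract does not say how the optimization is performed.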
