Preprint
Data sources: ZENODO

The Concept Bottleneck as a Deterministic Evaluation Layer: Implementation of Small Concept Models for Legal Reasoning via OpenAI Structured Outputs

Author: LERER, Ignacio Adrián

Abstract

Large Language Models process text through statistical token prediction, producing outputs that are non-deterministic and difficult to audit. This paper presents a working implementation of the Concept Bottleneck principle applied to legal text analysis. Rather than training a specialized small model, the system forces a general-purpose LLM (GPT-4o) to project any legal text onto a fixed 24-dimensional concept space, the Universal Legal Principles of the Lerer Architecture, producing a deterministic score vector in [0,1]^24 via OpenAI Structured Outputs and Zod schema validation. The implementation runs three independent evaluations in parallel (triple-run consensus), averages the resulting scores, and reports per-principle variance (confidence_spread) as a first-class output. The complete pipeline is deployed as a Model Context Protocol (MCP) server, enabling integration with any MCP-compatible client. Observations from a single-operator production deployment indicate that the triple-run consensus reduces per-principle variance by approximately 60-70% compared to single-run evaluation, and that confidence_spread values above 0.15 reliably identify semantically ambiguous texts. The paper extends the theoretical framework of Lerer (2025) with empirical observations from the running system, discusses the relationship to Meta's Large Concept Models (Barrault et al., 2024), and maps the Concept Bottleneck pattern onto five non-legal professional domains: medical triage, financial risk, corporate governance, educational assessment, and regulatory compliance. Code is available at https://github.com/adrianlerer/omnibrain-mcp under MIT license.
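The aggregation step the abstract describes (average three independent runs, report per-principle variability) can be sketched as below. This is an illustrative sketch, not code from the repository: the exact definition of confidence_spread in the implementation is not given here, so this sketch assumes it is the max-minus-min range across the three runs per principle, and the 0.15 ambiguity threshold is taken from the abstract.

```typescript
// Hypothetical sketch of triple-run consensus over a 24-dimensional
// concept score vector. Each run would come from a Structured Outputs
// call scoring a legal text against the 24 Universal Legal Principles;
// here the runs are just plain number arrays.

const NUM_PRINCIPLES = 24;
const AMBIGUITY_THRESHOLD = 0.15; // spread above this flags ambiguous text

type ScoreVector = number[]; // length 24, each value in [0, 1]

interface ConsensusResult {
  consensus: ScoreVector;        // per-principle mean of the three runs
  confidenceSpread: ScoreVector; // per-principle max - min across runs (assumed definition)
  ambiguousPrinciples: number[]; // indices where spread exceeds the threshold
}

function consensusOfRuns(runs: ScoreVector[]): ConsensusResult {
  if (runs.length !== 3 || runs.some((r) => r.length !== NUM_PRINCIPLES)) {
    throw new Error("expected exactly three 24-dimensional score vectors");
  }
  const consensus: number[] = [];
  const confidenceSpread: number[] = [];
  const ambiguousPrinciples: number[] = [];
  for (let i = 0; i < NUM_PRINCIPLES; i++) {
    const vals = runs.map((r) => r[i]);
    consensus.push(vals.reduce((a, b) => a + b, 0) / vals.length);
    const spread = Math.max(...vals) - Math.min(...vals);
    confidenceSpread.push(spread);
    if (spread > AMBIGUITY_THRESHOLD) ambiguousPrinciples.push(i);
  }
  return { consensus, confidenceSpread, ambiguousPrinciples };
}
```

Keeping the spread as a first-class output, rather than discarding it after averaging, is what lets a downstream client treat high-variance principles as a signal that the source text is semantically ambiguous rather than silently trusting the mean.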
