Powered by OpenAIRE graph
Research
Data sources: ZENODO

Filtration-Gated Artificial Intelligence: Regulation Before Authority in Trustworthy AI Systems

Authors: Warwick, Carl

Abstract

This working paper proposes a filtration-gated control architecture for building trustworthy artificial intelligence systems. Rather than relying on post-hoc safeguards, the framework introduces deterministic regulatory layers that constrain probabilistic reasoning before output and action occur. The architecture integrates filtration, reasoning permission, evidence validation, bias regulation, authority gating, memory control, and auditability into a bounded system design. The central claim is that trust in AI should not be derived from capability alone, but from the presence of structured, testable control mechanisms that regulate when and how intelligence is permitted to operate. The paper presents the architecture, explores failure modes, and outlines testable criteria for trustworthy AI. An applied prototype (Professor Santi) is referenced as a development environment for exploring these principles, though implementation details remain outside the scope of this work. This document is a working paper intended for feedback, discussion, and further development.
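
To make the ordering idea concrete, the following is a minimal sketch of deterministic gates running before a probabilistic model is allowed to produce output. It is an illustration only and not the paper's implementation or the Professor Santi prototype, whose details the abstract leaves out of scope. Every name here (`Request`, `filtration_gate`, `evidence_gate`, `authority_gate`, `run_gated`) and every rule inside the gates is a hypothetical assumption chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Request:
    """Hypothetical request shape; the paper does not define one."""
    prompt: str
    evidence: List[str] = field(default_factory=list)
    wants_action: bool = False      # True if the caller asks the system to act
    action_approved: bool = False   # set by some external authority (assumed)


# A gate is a deterministic check returning (passed, reason).
Gate = Callable[[Request], Tuple[bool, str]]
AuditEntry = Tuple[str, bool, str]


def filtration_gate(req: Request) -> Tuple[bool, str]:
    # Assumed rule: reject empty or whitespace-only input.
    ok = bool(req.prompt.strip())
    return ok, ("prompt present" if ok else "empty prompt")


def evidence_gate(req: Request) -> Tuple[bool, str]:
    # Assumed rule: at least one piece of supporting evidence is required.
    ok = len(req.evidence) > 0
    return ok, ("evidence supplied" if ok else "no evidence supplied")


def authority_gate(req: Request) -> Tuple[bool, str]:
    # Assumed rule: actions need explicit approval; plain answers do not.
    ok = (not req.wants_action) or req.action_approved
    return ok, ("authority granted" if ok else "action not approved")


def run_gated(
    req: Request,
    gates: List[Gate],
    model: Callable[[str], str],
) -> Tuple[Optional[str], List[AuditEntry]]:
    """Run every gate in order; call the model only if all gates pass.

    Returns the model output (or None if blocked) plus an audit trail.
    """
    audit: List[AuditEntry] = []
    for gate in gates:
        passed, reason = gate(req)
        audit.append((gate.__name__, passed, reason))
        if not passed:
            # Blocked before the model runs: no output, no action.
            return None, audit
    return model(req.prompt), audit
```

In this sketch the model is any callable from prompt to text, so a stub can stand in for it during testing. The design choice it illustrates is that a failed gate short-circuits before the model is invoked, and the audit trail records which gate blocked the request and why.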
