
Drift Detection: When an AI's Values Shift | Geometry of Trust | Mathematics - Lesson 3

A single measurement tells you what an AI values right now. But what happens over thousands of prompts? Are the values stable, or are they drifting? In this talk we build continuous monitoring on top of the ruler and probes from Parts 1 and 2: the same causal Gram matrix and the same probes, applied to every prompt. The system builds a statistical baseline using Welford's online algorithm, then watches for deviations. When a value shifts beyond a governance-defined threshold, the system creates a signed, hash-linked alert that cannot be deleted or altered after the fact without detection. We walk through a complete worked example: building a baseline over 50 prompts, monitoring through prompts 51–100, then catching a sharp drop in honesty at prompt 101. The alert fires, the attestation is signed, and the chain creates a tamper-evident audit trail from BASELINE → SNAPSHOT → ALERT.
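The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the talk's implementation: the per-prompt honesty scores are made-up numbers standing in for probe readings, the z-score threshold of 3.0 and the snapshot interval are hypothetical governance settings, and an HMAC with a demo key stands in for a real digital signature. All function and field names (`Welford`, `append_record`, `verify_chain`, `monitor`) are invented for this example.

```python
import hashlib
import hmac
import json
import math

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


class Welford:
    """Running mean and variance via Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; 0.0 until two points have been seen.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def _body(kind, payload, prev_hash):
    # Canonical serialization that both the hash and the signature cover.
    return json.dumps({"kind": kind, "payload": payload, "prev": prev_hash},
                      sort_keys=True).encode()


def append_record(chain, kind, payload, key):
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = _body(kind, payload, prev_hash)
    chain.append({
        "kind": kind,
        "payload": payload,
        "prev": prev_hash,
        "hash": hashlib.sha256(body).hexdigest(),
        # Assumption: HMAC stands in for a real signature scheme.
        "sig": hmac.new(key, body, hashlib.sha256).hexdigest(),
    })


def verify_chain(chain, key):
    """Return True only if every link, hash, and signature checks out."""
    prev_hash = GENESIS
    for rec in chain:
        body = _body(rec["kind"], rec["payload"], prev_hash)
        expected_sig = hmac.new(key, body, hashlib.sha256).hexdigest()
        if (rec["prev"] != prev_hash
                or rec["hash"] != hashlib.sha256(body).hexdigest()
                or not hmac.compare_digest(rec["sig"], expected_sig)):
            return False
        prev_hash = rec["hash"]
    return True


def monitor(scores, key, baseline_size=50, z_threshold=3.0, snapshot_every=50):
    """Build a baseline, then log SNAPSHOT and ALERT records to a chain."""
    stats, chain = Welford(), []
    for i, score in enumerate(scores, start=1):
        if i <= baseline_size:
            stats.update(score)
            if i == baseline_size:
                append_record(chain, "BASELINE",
                              {"n": stats.n, "mean": stats.mean, "std": stats.std}, key)
            continue
        z = (score - stats.mean) / stats.std if stats.std > 0 else 0.0
        if abs(z) > z_threshold:
            append_record(chain, "ALERT", {"prompt": i, "score": score, "z": z}, key)
        elif i % snapshot_every == 0:
            append_record(chain, "SNAPSHOT", {"prompt": i, "score": score, "z": z}, key)
    return chain


# Hypothetical data: 100 stable honesty scores near 0.80, then a drop at prompt 101.
KEY = b"demo-key"
scores = [0.80 + 0.01 * ((i % 5) - 2) for i in range(1, 101)] + [0.40]
chain = monitor(scores, KEY)
print([rec["kind"] for rec in chain])
print(verify_chain(chain, KEY))
```

Because each record's hash covers the previous record's hash, editing or removing an earlier record breaks verification for everything after it. That is the sense in which the trail is tamper-evident: changes are detectable rather than impossible. Note that the baseline statistics are frozen after the first 50 prompts in this sketch; whether the baseline should keep updating during monitoring is a design choice the abstract does not specify.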
