Forensic Provenance for LLM Deployments: Real-Time Activation Watermarking for Legal Non-Repudiation

Vargas Altalaguerri, Jose Joaquín


Preprint

Data sources: ZENODO


Publication: Preprint. Under curation. Publisher: Zenodo

Authors: Vargas Altalaguerri, Jose Joaquín

doi: 10.5281/zenodo.20570544



Abstract

This preprint presents a white-box activation-watermarking system for forensic provenance and legal non-repudiation of LLM generations. Each generation is tagged in the residual stream with a per-session payload encoded by key-derived sign codes, at an amplitude that leaves the generated text effectively unchanged (KL ≈ 3e-4). The payload is recovered from the provider's activation logs, and a binomial test yields either a legal-grade provenance statement (attribution) or its absence (exculpation). The watermark is recoverable from activations, not from text: the non-linear unembedding scrambles it before the logits. The same effect that kills a copyright watermark is what makes the forensic one sound. The appendix reports the supporting study. Engineered CDMA superposition survives only in the linear channel or with a trained de-multiplexer, and collapses at every untrained non-linear readout (shown across six applications). Per-token surprise predicts multiplexability, and the per-token demand of language collapses the capacity to K ≈ 2. Code and experiments: https://github.com/ttzrs/neural-cdma
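The tag, recover, and test pipeline summarized in the abstract can be illustrated with a minimal sketch. This is not the code from the linked repository. The function names, the HMAC-SHA256 key derivation, the Gaussian stand-in for residual-stream activations, and all parameter values (`dim`, `alpha`, the 16-bit payload, 32 tokens) are assumptions chosen for illustration, and the amplitude here is far larger than anything the abstract's KL figure would imply.

```python
import hashlib
import hmac
import math
import random


def sign_codes(key: bytes, session_id: bytes, n_bits: int, dim: int) -> list[list[int]]:
    """Derive one +/-1 code vector per payload bit from an HMAC-seeded PRNG (assumed scheme)."""
    seed = int.from_bytes(hmac.new(key, session_id, hashlib.sha256).digest(), "big")
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(n_bits)]


def embed(acts, codes, payload, alpha):
    """Add alpha * sum_b sign_b * code_b to every token's activation vector."""
    dim = len(codes[0])
    delta = [
        alpha * sum((1 if bit else -1) * code[j] for bit, code in zip(payload, codes))
        for j in range(dim)
    ]
    return [[x + d for x, d in zip(row, delta)] for row in acts]


def recover(acts, codes):
    """Correlate the token-summed activations with each code; the sign gives the bit."""
    total = [sum(col) for col in zip(*acts)]
    return [int(sum(t * c for t, c in zip(total, code)) > 0) for code in codes]


def binomial_p_value(matches: int, n_bits: int) -> float:
    """P(X >= matches) for X ~ Binomial(n_bits, 1/2): agreement by chance under the null."""
    return sum(math.comb(n_bits, k) for k in range(matches, n_bits + 1)) / 2 ** n_bits


# Hypothetical session: 16-bit payload, 32 tokens, 2048-dimensional activations.
key, session = b"provider-secret", b"session-0001"
payload = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
codes = sign_codes(key, session, n_bits=len(payload), dim=2048)

noise = random.Random(0)  # stand-in for logged residual-stream activations
acts = [[noise.gauss(0.0, 1.0) for _ in range(2048)] for _ in range(32)]

marked = embed(acts, codes, payload, alpha=0.2)
recovered = recover(marked, codes)
matches = sum(r == p for r, p in zip(recovered, payload))
print(matches, binomial_p_value(matches, len(payload)))
```

In this toy setting a small p-value supports attribution to the claimed session, while a p-value near chance level is consistent with exculpation. A longer payload lowers the smallest achievable p-value, which is bounded below by 2^-n for an n-bit payload.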
