Totem - Runtime behavioral integrity verification for LLMs in decision support systems via salted probing and cryptographic attestation

Totem is a backend-agnostic proxy for runtime behavioral integrity verification of Large Language Models deployed as decision support systems. While existing LLM security tools protect against malicious users, none verify whether the model itself has been compromised. Totem fills this gap through three mechanisms: behavioral profiling via refusal classification, salted probes with steganographic triggers to prevent selective evasion, and cryptographically signed behavioral baselines (Model Manifest) authenticated via Ed25519 digital signatures. Evaluated across three model families and two attack vectors, Totem achieves 70% must-refuse detection rate on uncensored model swaps with 0% false positives, while naive baselines detect 0% of the same swaps. Released as open-source software under the Apache 2.0 license.

Keywords

runtime verification, model integrity, llm security, behavioral probing, guardrail bypass detection, decision support systems, cryptographic attestation

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	0
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Average
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

0

Average