ClinicalVerifier: A Retrieval-Augmented Pipeline for Detecting Guideline Contradictions in LLM-Generated Mental Health Text

Large language models (LLMs) are increasingly deployed in mental health applications, yet their outputs may silently contradict clinical guidelines, posing serious patient safety risks. This paper presents ClinicalVerifier, a retrieval-augmented generation (RAG) system that automatically detects when LLM-generated clinical text contradicts evidence-based NICE and WHO guidelines. The system combines a FAISS-indexed embedding store of guideline excerpts with an LLM judge (Llama-3.3-70B via Groq), augmented by a Neighbourhood Consistency Scoring (NCS) hallucination probe, to produce calibrated combined risk levels (LOW / MEDIUM / HIGH). Guideline sources include NICE CG90, NG185, CG178, NG116, CG53, CG42, and the WHO mhGAP Intervention Guide 2023. Evaluated on a 30-case benchmark of clinical outputs labelled safe, uncertain, or contradicts, ClinicalVerifier achieves 73.3% overall accuracy, 95.2% F1 on safety-critical contradiction detection, 100% recall on guideline-contradicting cases, and 100% precision on HIGH combined-risk alerts. On this benchmark, the system's conservative design misses no guideline violations, while the dual-signal architecture (RAG verdict + NCS score) minimises false alarms. The pipeline is fully open-source, runs without proprietary API access, and is designed for integration into clinical AI monitoring workflows as a sidecar service.
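The abstract does not spell out how the RAG verdict and the NCS score are merged into a LOW / MEDIUM / HIGH level. The following is a minimal sketch of one plausible dual-signal rule, not the authors' implementation. The names `Verdict`, `Risk`, `combined_risk`, the 0.5 threshold, and the convention that a higher NCS score means a more likely hallucination are all assumptions made for illustration.

```python
from enum import Enum


class Verdict(Enum):
    """Judge verdict labels, mirroring the benchmark's three classes."""
    SAFE = "safe"
    UNCERTAIN = "uncertain"
    CONTRADICTS = "contradicts"


class Risk(Enum):
    """Combined risk levels named in the abstract."""
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


# Hypothetical cut-off: the abstract does not say how the NCS score is thresholded.
NCS_ALERT_THRESHOLD = 0.5


def combined_risk(verdict: Verdict, ncs_score: float) -> Risk:
    """Merge the judge verdict with an NCS score in [0, 1].

    Assumed convention: a higher ncs_score means the text is more
    likely to be hallucinated.
    """
    if not 0.0 <= ncs_score <= 1.0:
        raise ValueError("ncs_score must lie in [0, 1]")

    ncs_flag = ncs_score >= NCS_ALERT_THRESHOLD

    if verdict is Verdict.CONTRADICTS:
        # Conservative: a contradiction verdict is never LOW.
        # HIGH is reserved for cases where both signals agree.
        return Risk.HIGH if ncs_flag else Risk.MEDIUM
    if verdict is Verdict.UNCERTAIN or ncs_flag:
        # Either signal alone is enough to warrant review.
        return Risk.MEDIUM
    return Risk.LOW


if __name__ == "__main__":
    for verdict, score in [(Verdict.CONTRADICTS, 0.9),
                           (Verdict.CONTRADICTS, 0.1),
                           (Verdict.SAFE, 0.1)]:
        print(verdict.value, score, combined_risk(verdict, score).value)
```

Under this assumed rule, reserving HIGH for cases where both signals agree favours precision on HIGH alerts, while never mapping a contradiction verdict to LOW favours recall on guideline-contradicting cases.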

Keywords

WHO guidelines, FAISS, Llama, retrieval-augmented generation, contradiction detection, RAG, hallucination detection, LLM safety, guideline compliance, clinical AI monitoring, patient safety, large language models, clinical NLP, mental health AI, NICE guidelines
