AI Security in the Era of Large Language Models: Red Teaming, Jailbreaking, and Prompt Injection

The rapid deployment of Large Language Models (LLMs) across enterprise and consumer applications has introduced a novel class of security vulnerabilities that challenge conventional cybersecurity paradigms. This paper examines the emerging AI security landscape through three interconnected lenses: the transplantation of the Red Team / Blue Team methodology from classical information security into the domain of AI; the taxonomy and evolution of jailbreaking techniques used to circumvent LLM safety alignment; and prompt injection—now ranked as the #1 threat in the OWASP Top 10 for LLM Applications (2025)—as the dominant unsolved attack vector threatening LLM-integrated systems. We further analyze the intersection of traditional vulnerability classes such as Insecure Direct Object Reference (IDOR) with AI-augmented offensive tooling, the threat of knowledge-base poisoning in Retrieval-Augmented Generation (RAG) systems, and the current state of AI-as-firewall defensive architectures. Our analysis synthesizes practitioner observations and peer-reviewed findings to argue that AI security constitutes an unending adversarial cycle—one that demands continuous, layered, defense-in-depth strategies rather than static, checklist-driven postures.
