ZENODO
Preprint, 2026
License: CC BY
Data sources: Datacite

From Artifacts to Risk: Auditing Instruction Surfaces in Agent Systems

Author: Gordeychik, Sergey

Abstract

Agentic systems increasingly rely on persistent instruction artifacts, tool integrations, and repository-level configuration that shape behavior beyond individual prompts. Prior work has established prompt injection, indirect instruction attacks, tool poisoning, and agent hijacking as practical security concerns. Less attention, however, has been given to the repository layer as a persistent and auditable source of agent behavior. This paper presents a bottom-up, artifact-centric audit of instruction surfaces in agent systems. We analyze a purposive corpus of 509 instruction-rich repositories containing agent guidance files, skills, plugin manifests, and Model Context Protocol (MCP)-related artifacts. The scan produced 4,882 medium-or-higher raw findings and 4,637 clustered issue instances. The contribution is not a new prompt-injection benchmark or a replacement for existing scanners. Instead, this study integrates heterogeneous signature sources, applies them to real repositories, correlates raw detections into artifact-level issue instances, and maps the resulting evidence to an ASAMM-aligned agent-security interpretation layer. We explicitly treat detector outputs as candidate evidence rather than proof of exploitability. The paper positions instruction surfaces as repository-level control-plane artifacts and argues that agent security practice needs artifact-level auditing alongside runtime testing and defense.
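
The step from raw findings to clustered issue instances can be illustrated with a minimal sketch. The abstract does not state the clustering key, the severity scale, or any data model, so everything below is an assumption made for illustration: the `Finding` fields, the `cluster_findings` helper, the four-level severity scale, the grouping key of (repository, artifact path, rule family), and the sample repository, file, and scanner names are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical severity scale; the paper only mentions "medium-or-higher".
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Finding:
    """One raw detector hit (all field names are illustrative assumptions)."""
    repo: str         # repository identifier
    artifact: str     # path of the instruction artifact inside the repo
    rule_family: str  # coarse category of the matching signature
    source: str       # which signature source produced the hit
    severity: str     # one of the SEVERITY_ORDER keys


def cluster_findings(findings, min_severity="medium"):
    """Group raw findings at or above min_severity into issue instances.

    Assumed clustering key: (repo, artifact, rule_family). Hits from
    different signature sources on the same artifact and rule family
    collapse into a single candidate issue instance.
    """
    threshold = SEVERITY_ORDER[min_severity]
    clusters = defaultdict(list)
    for f in findings:
        if SEVERITY_ORDER[f.severity] >= threshold:
            clusters[(f.repo, f.artifact, f.rule_family)].append(f)
    return dict(clusters)


# Made-up sample data, not drawn from the audited corpus.
findings = [
    Finding("org/agent-a", "AGENTS.md", "hidden-instruction", "scanner-1", "high"),
    Finding("org/agent-a", "AGENTS.md", "hidden-instruction", "scanner-2", "medium"),
    Finding("org/agent-a", "mcp.json", "unpinned-tool", "scanner-1", "medium"),
    Finding("org/agent-a", "README.md", "style", "scanner-1", "low"),
]

issues = cluster_findings(findings)
raw_kept = sum(len(hits) for hits in issues.values())
print(f"{raw_kept} raw findings -> {len(issues)} issue instances")
```

In this toy input, two scanners flag the same artifact for the same rule family, so three medium-or-higher raw findings reduce to two issue instances. Each cluster remains candidate evidence for review, in line with the abstract's caution that detector output is not proof of exploitability.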

Keywords

Artificial intelligence, Computer security
