ZENODO
Conference object . 2025
License: CC BY NC
Data sources: ZENODO

Cloak, Honey, Trap: Proactive Defenses Against LLM Agents

Authors: Ayzenshteyn, Daniel; Weiss, Roy; Mirsky, Yisroel

Abstract

This artifact accompanies our USENIX Security 2025 paper "Cloak, Honey, Trap: Proactive Defenses Against LLM Agents."

- **CHeaT.zip** — complete repository with the CHeaT CLI, datasets, and a playground notebook.
- **CTF machines** — 11 challenge VMs, each provided as a separate ZIP archive.

The repository's main README appears below.

---

## 1. Overview

**CHeaT (Cloak–Honey–Trap)** is a command-line tool designed to **defend networks against autonomous, LLM-powered penetration-testing agents**. It works by embedding string-based payloads into network assets — payloads specifically crafted to **disrupt, deceive, and detect** such agents.

### Core Defense Strategies:

1. **Cloaking** – Obfuscate sensitive data with strategic misdirection
2. **Honey** – Embed tokens to detect and fingerprint LLM-driven agents
3. **Traps** – Deploy inputs that stall, confuse, or crash malicious automation

CHeaT implements **6 distinct strategies** encompassing **15 payload generation techniques**, forming a layered, proactive defense against LLM-based threats.

For more information on how it works, please see our USENIX Security '25 publication:

``Daniel Ayzenshteyn, Roy Weiss, and Yisroel Mirsky. "Cloak, Honey, Trap: Proactive Defenses Against LLM Agents." 34th USENIX Security Symposium (USENIX Security 25). 2025.``

---

## 2. Tool Quick Start 🚀

> **TL;DR**

```bash
# clone repo & enter tool folder
git clone https://github.com/Daniel-Ayz/CHeaT.git
cd CHeaT

# optional: create venv
python3 -m venv .venv && source .venv/bin/activate

# install (pure-stdlib -> nothing to pull)
pip install -e .
```
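To make the planting step concrete, here is a minimal sketch of what embedding and registering a honeytoken in a web file could look like. This assumes nothing about CHeaT's internals: the function name `plant_honeytoken`, the payload format, and the registry path are all hypothetical illustrations, not the tool's API.

```python
# Hypothetical sketch (NOT CHeaT's actual implementation): plant an
# invisible honeytoken in an HTML asset and record it in a JSON registry
# so it can later be listed or removed by ID.
import json
import secrets
from pathlib import Path

def plant_honeytoken(file_path: str, registry: str = "/tmp/cheat_db.json") -> str:
    """Append a uniquely identifiable honeytoken to a file and record it."""
    token_id = secrets.token_hex(8)
    # An HTML comment is invisible in a browser but visible to a scraping agent.
    payload = f"<!-- internal-api-key: HONEY-{token_id} -->"
    path = Path(file_path)
    path.write_text(path.read_text() + "\n" + payload + "\n")
    # Record the planted payload so it can be enumerated or cleaned up later.
    db_path = Path(registry)
    db = json.loads(db_path.read_text()) if db_path.exists() else {}
    db[token_id] = {"file": str(path), "payload": payload}
    db_path.write_text(json.dumps(db, indent=2))
    return token_id

Path("/tmp/test.html").write_text("Hello\n")
token = plant_honeytoken("/tmp/test.html")
```

Any later request or command that touches the planted string can then be matched against the registry to detect and fingerprint an automated agent.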
```bash
# plant a random defense in a test HTML
echo "Hello" > /tmp/test.html
cheat --action plant --details '{
  "assettype": "web_file",
  "file_path": "/tmp/test.html",
  "technique": "random"
}'
```

| Action              | Example                                                                                                  |
| ------------------- | -------------------------------------------------------------------------------------------------------- |
| **Plant**           | `cheat --action plant --details '{"assettype":"local_file","file_path":"readme.txt","technique":"S1i"}'` |
| **List installed**  | `cheat --action list --type installed`                                                                   |
| **Remove by ID**    | `cheat --action remove --id ""`                                                                          |
| **Remove all**      | `cheat --action remove_all`                                                                              |
| **Point to alt DB** | `cheat ... --database /path/to/db`                                                                       |

See [`cheat/README.md`](cheat/README.md) for full CLI docs.

---

## 3. Repository Layout

```
CHeaT/
├─ cheat/            ← Python package (tool)
│  ├─ database/      ← default JSON techniques & templates
│  └─ ...
├─ datasets/         ← datasets used in the paper evaluations
├─ ctf-machines/     ← ready-to-run vulnerable VMs
├─ token-landmines/  ← unicode landmines
├─ demo-notebook/    ← Jupyter walkthrough & sandbox
├─ Whitepaper.pdf    ← full academic paper
└─ README.md         ← you are here
```

### 3.1 `cheat/`

Here you will find the source code of the CHeaT payload-injection tool, along with instructions in [`cheat/README.md`](cheat/README.md).

### 3.2 `datasets/`

This directory contains the datasets used in the paper's evaluations:

```
datasets/
├─ dataset_main.json
├─ dataset_boosted_with_pi.json
├─ dataset_unicode_honeytokens.json
└─ payloads/
   ├─ payloads.json
   └─ payloads_boosted_with_prompt_injection.json
```

* **`payloads.json`** – the framed payloads constructed in the paper.
* **`payloads_boosted_with_prompt_injection.json`** – payloads that are *boosted* with a prompt-injection wrapper.
* **`dataset_main.json`** – embeds the framed payloads at multiple target data points and system prompts (uses `payloads.json`).
* **`dataset_boosted_with_pi.json`** – identical structure but built from the boosted payloads.
* **`dataset_unicode_honeytokens.json`** – dataset used to evaluate the honeytokens (Set A and Set B in T3.2).

### 3.3 `ctf-machines/`

This directory holds the 11 CTF machines (ready-to-import OVA VMs) created for the paper and used in its evaluation: `UbuntuX`, `VulBox`, `DGPro`, `Imagery`, `CornHub`, `Tr4c3`, `Hackme`, `Shocker`, `Corpnet`, `Kermit`, `GitGambit`. Each sub-directory includes a walkthrough solution. For the corresponding `.ova` VM files, please visit our Zenodo dataset. If you use these CTFs in your work, please cite our paper.

### 3.4 `token-landmines/`

Here you will find the code used to generate the "landmine tokens" from the paper. Token landmines are rare sequences of tokens that corrupt a model's internal state, causing it to output gibberish or hallucinations. The contents of this folder will remain empty until one month after publication, to give vendors time to patch their LLM services.

### 3.5 `demo-notebook/`

Here you will find a Jupyter notebook you can use to poke and prod PentestGPT in a safe sandbox:

- load saved attack snapshots,
- drop in new hints / traps,
- watch how the agent reasons and what commands it generates.

---

## 4. License 📄

This project is licensed under the CC BY-NC 4.0 License. See the [LICENSE](./LICENSE) file for details.

---

## 5. Citation 🤝

If you use our code, datasets, or CTF VMs, please cite us:

```bibtex
@inproceedings{Ayzenshteyn2025CHeaT,
  title={{CHeaT}: Cloak, Honey, Trap – Proactive Defenses Against LLM Agents},
  author={Daniel Ayzenshteyn and Roy Weiss and Yisroel Mirsky},
  booktitle={USENIX Security},
  year={2025}
}
```

Happy trapping! 🕸️
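The unicode honeytokens referenced above can be illustrated with a small sketch. This is an assumption about the general idea, not the paper's actual generator: hide a machine-recoverable marker in zero-width characters, so text that an agent copies verbatim carries a fingerprint invisible to human readers.

```python
# Hypothetical sketch of a unicode honeytoken (assumed scheme, not the
# paper's generator): encode a marker's bits as zero-width characters
# appended to innocuous text, then recover it from copied output.
ZERO_WIDTH = {"0": "\u200b", "1": "\u200c"}  # zero-width space / non-joiner
REVERSE = {v: k for k, v in ZERO_WIDTH.items()}

def embed(text: str, marker: str) -> str:
    """Append the marker, encoded as zero-width characters, to the text."""
    bits = "".join(f"{ord(c):08b}" for c in marker)
    return text + "".join(ZERO_WIDTH[b] for b in bits)

def extract(text: str) -> str:
    """Recover a hidden marker from any zero-width characters in the text."""
    bits = "".join(REVERSE[c] for c in text if c in REVERSE)
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return "".join(chr(int(b, 2)) for b in chunks if len(b) == 8)

tagged = embed("Welcome to the staging server.", "HT42")
# The tagged string renders identically to the original for a human reader,
# but extract() recovers the marker from anything that quotes it.
```

Finding the marker in an agent's transcript or outbound traffic then serves as a detection signal, much like the honeytokens evaluated in `dataset_unicode_honeytokens.json`.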
