
This repository contains the research implementation of OnionGuard.

⚠️ Note for Reviewers: Data Availability & Reproducibility

To adhere to double-blind review policies and anonymous artifact hosting constraints (e.g., per-file size limits), this repository contains a lightweight version of the system artifacts.

- Knowledge Bases (KBs): We provide Lite versions of the vector stores. They are fully functional but include only a reduced subset of the original KB entries.
- Datasets: We provide fixed 200-sample subsets per benchmark under `dataset/` to verify the execution pipeline and logic.
- Performance: This artifact targets functional reproducibility and comparative validation; exact headline numbers from the full-scale experiments are not expected.

📋 Prerequisites

- Python: 3.10.18
- Conda: Anaconda
- Hardware: NVIDIA GPU + CUDA driver (required for vLLM inference)

🛠️ Installation

1. Create and Activate Environment

First, create a conda environment using the provided `environment.yml` file.

    conda env create -f environment.yml
    conda activate onion_guard

2. Install Package

Install the package in editable mode.

    pip install -e .
    conda develop .

🚀 Getting Started

To run OnionGuard, start the vLLM server first, then run the test scripts in a separate terminal.

1. Start the vLLM Server

Run the startup script to initialize the inference server.

    chmod +x ./execute_vllm.sh
    bash ./execute_vllm.sh

Note: Keep this terminal open while running the tests.

2. Run OnionGuard

Open a new terminal, activate the environment, and navigate to the configuration directory.

    conda activate onion_guard
    cd examples/configs/OnionGuard

You can evaluate OnionGuard using the following benchmark scripts.

Attack Defense Benchmark

Evaluate the defense performance against direct attacks.

    python ONION_GUARD_ATTACK_TEST.py

Safety Dataset Benchmarks

Evaluate OnionGuard against various standard safety datasets. Replace `<DATASET>` with one of the supported dataset names.

    python ONION_GUARD_BENCHMARK_TEST.py --dataset <DATASET>

Supported Datasets:

- `AEGIS`
- `XSTEST`
- `OAI`
- `TOXIC`

Examples:

    # Run benchmark on AEGIS dataset
    python ONION_GUARD_BENCHMARK_TEST.py --dataset AEGIS

    # Run benchmark on XSTEST dataset
    python ONION_GUARD_BENCHMARK_TEST.py --dataset XSTEST

WildGuard Output Benchmark

Evaluate the output filtering capabilities using the WildGuard benchmark.

    python ONION_GUARD_WILDGUARD_OUTPUT_TEST.py

📁 Key Paths (for reviewers)

- Core OnionGuard logic: `nemoguardrails/library/onion_guard/`
- Benchmark Configs & KBs: `examples/configs/OnionGuard/`
- OnionGuard System Prompts: `examples/configs/OnionGuard/config/prompts.yml`

❓ Troubleshooting

If you encounter any issues during reproduction, please check that:

1. the vLLM server is running,
2. the correct environment is activated, and
3. you are executing scripts under `examples/configs/OnionGuard/`.
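
The three troubleshooting checks can be sketched as a small preflight script. This is not part of the repository; it is a minimal illustration under stated assumptions. The helper names are hypothetical, and the host and port (`127.0.0.1:8000`) are an assumption that should be adjusted to whatever `execute_vllm.sh` actually configures. The environment check relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets.

```python
import os
import socket
from pathlib import Path

# Values taken from the README.
EXPECTED_ENV = "onion_guard"
EXPECTED_DIR = Path("examples/configs/OnionGuard")

# ASSUMPTION: host and port are not stated in the README.
# Check execute_vllm.sh and adjust these to match.
VLLM_HOST, VLLM_PORT = "127.0.0.1", 8000


def env_is_active(environ=os.environ) -> bool:
    """True if the active conda environment is the expected one."""
    return environ.get("CONDA_DEFAULT_ENV") == EXPECTED_ENV


def in_config_dir(cwd=None) -> bool:
    """True if the working directory path ends with the config directory."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    return cwd.parts[-len(EXPECTED_DIR.parts):] == EXPECTED_DIR.parts


def server_is_listening(host=VLLM_HOST, port=VLLM_PORT, timeout=2.0) -> bool:
    """True if a TCP connection to host:port succeeds.

    This only shows that something is listening on the port,
    not that the model has finished loading.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight() -> dict:
    """Run all three checks and return their results by name."""
    return {
        "vLLM server reachable": server_is_listening(),
        "conda env active": env_is_active(),
        "in config directory": in_config_dir(),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"[{'ok' if ok else 'FAIL'}] {name}")
```

Running the script from the terminal used for the benchmarks prints one line per check, which narrows down which of the three conditions is not met.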
