OWASP LLM & Agentic Top 10 Testing
OWASP publishes two security frameworks that any team shipping generative AI to production should be measuring against:
- OWASP Top 10 for LLM Applications (2025) - the canonical list of LLM-specific risks, from prompt injection to unbounded consumption.
- OWASP Top 10 for Agentic Applications (2026) - the newer list covering risks unique to autonomous, tool-using agents: goal hijack, tool misuse, identity privilege abuse, cascading failures, and more.
Okareo maintains compliance-owasp, a first-party open-source repository that turns these standards into a runnable test suite. It covers all 20 categories (LLM01-LLM10, ASI01-ASI10) with 55 discrete test scenarios, adversarial driver personas, model-based and code-based checks, and reproducible Jupyter notebooks. The repo is the source of truth; Okareo is the deployment target.
You can start testing your agent's OWASP compliance in under two minutes with the OWASP Compliance Quick Start. Fork the repo, point it at your agent endpoint, run python run_suite.py --dir LLM01-prompt-injection.
Why use compliance-owasp instead of writing it yourself
Most teams know they should be testing for prompt injection or excessive agency, but never get to it because the up-front cost is high: you need adversarial scenarios, attacker personas, and judges that know what "the agent leaked the system prompt" looks like. compliance-owasp ships those primitives, parameterized so you can fork once and customize per category.
| What you get | What's in the repo |
|---|---|
| Scenarios | .jsonl seed inputs per category (5 for LLM01, 3 for LLM02, etc.) |
| Adversarial drivers | Markdown persona prompts that simulate attackers across 5-10 conversational turns |
| Checks | Model-based judges (.md) for behavioral evaluation, code-based checks (.py) for deterministic rules |
| Notebooks | One run-evaluation.ipynb per category, idempotent, traceable |
| CLI runner | run_suite.py uploads artifacts and runs the full evaluation in one command |
OWASP LLM Top 10 (2025) coverage
| ID | Category | Severity | Eval mode |
|---|---|---|---|
| LLM01 | Prompt Injection | Critical | Single + multi-turn |
| LLM02 | Sensitive Information Disclosure | Critical | Single-turn |
| LLM03 | Supply Chain Vulnerabilities | High | Single-turn |
| LLM04 | Data and Model Poisoning | High | Single-turn |
| LLM05 | Improper Output Handling | High | Single-turn |
| LLM06 | Excessive Agency | Critical | Multi-turn |
| LLM07 | System Prompt Leakage | High | Single + multi-turn |
| LLM08 | Vector and Embedding Weaknesses | High | Single + multi-turn |
| LLM09 | Misinformation | Medium | Single-turn |
| LLM10 | Unbounded Consumption | Medium | Multi-turn |
OWASP Agentic AI Top 10 (2026) coverage
| ID | Category | Severity | Eval mode |
|---|---|---|---|
| ASI01 | Agent Goal Hijack | Critical | Multi-turn |
| ASI02 | Tool Misuse and Exploitation | Critical | Multi-turn |
| ASI03 | Identity and Privilege Abuse | Critical | Multi-turn |
| ASI04 | Agentic Supply Chain Vulnerabilities | High | Single-turn |
| ASI05 | Unexpected Code Execution (RCE) | Critical | Multi-turn |
| ASI06 | Memory & Context Poisoning | Critical | Multi-turn |
| ASI07 | Insecure Inter-Agent Communication | High | Single-turn |
| ASI08 | Cascading Failures | High | Multi-turn + trace bridge |
| ASI09 | Human-Agent Trust Exploitation | High | Multi-turn |
| ASI10 | Rogue Agents | Critical | Multi-turn |
Single-turn vs multi-turn: which pattern, when
The 20 OWASP categories split cleanly into two evaluation patterns. Picking the right one is the most important decision when adapting the suite to your agent.
Single-turn checks test a stateless property. One adversarial input goes in, one response comes out, and a check evaluates whether the response leaked, hallucinated, or generated unsafe output. Use single-turn for: PII exfiltration, credential leakage, output schema violations, hallucination, supply-chain validation. The OWASP suite uses single-turn evaluation for LLM02, LLM03, LLM04, LLM05, LLM09, ASI04, and ASI07.
Multi-turn simulations test a behavioral property that only emerges across conversation. An adversarial driver persona pushes the agent across 5-10 turns, attempting crescendo escalation, role manipulation, or progressive permission widening. Use multi-turn for: jailbreak chains, excessive agency, iterative system-prompt extraction, goal hijack, tool misuse, memory poisoning. Most ASI categories require multi-turn because agentic risks are inherently temporal: the failure pattern is "the agent did the wrong thing on turn 7 because of what happened on turn 3."
LLM01 (prompt injection) and LLM07 (system prompt leakage) use both: single-turn for the obvious blunt attacks, multi-turn for the patient ones.
See the Multi-Turn Simulation overview for the underlying simulation primitive, and Adversarial Drivers for how to author the attacker personas.
Run a category: fork, point, run
# 1. Fork and clone
git clone https://github.com/<your-org>/compliance-owasp.git
cd compliance-owasp
# 2. Configure
cp owasp/config.env.example .env
cp owasp/target.json.example owasp/target.json
# Edit .env with OKAREO_API_KEY
# Edit owasp/target.json with your agent's endpoint, request body, response path
# 3. Install
uv sync
# 4. Run a category end-to-end
uv run python run_suite.py --dir LLM01-prompt-injection
run_suite.py uploads scenarios, checks, and drivers to your Okareo workspace, then runs evaluations. Re-running is idempotent. Common flags:
| Flag | Use |
|---|---|
--dir LLM06-excessive-agency | Required. Which category to run. |
--max-turns 8 | Override default turn count for multi-turn simulations. |
--sim iterative-extraction | Run only simulations whose name matches this substring. |
--upload-only | Push artifacts without running. |
--eval-only | Run against artifacts already uploaded. |
--target owasp/target.prod.json | Use a different target config (e.g. for a separate environment). |
You can also run from the per-category notebook (owasp/LLM01-prompt-injection/notebooks/run-evaluation.ipynb) if you prefer an interactive surface.
ASI08: bridging live traces to simulation
The ASI08 (Cascading Failures) category includes a trace-simulation bridge that links live OpenTelemetry traces from your production agent to simulation runs via shared context_token and session.id fields. This is how you take a real failure observed in production and turn it into a reproducible simulation scenario, with the original trace attached as evidence.
# Configure
cp owasp/target.json.trace-bridge-example owasp/target.json
# Upload artifacts and run the bridge simulation
uv run python run_suite.py --dir ASI08-cascading-failures \
--sim pipeline-cascade-failure --max-turns 8
# Continue in the trace-bridge notebook for OTEL ingestion
# and online datapoint evaluation:
# owasp/ASI08-cascading-failures/notebooks/run-trace-bridge-evaluation.ipynb
When a selected simulation declares "requires_monitor": true in eval_config.json, run_suite.py automatically creates (or reuses) a single category-scoped Monitor instead of one per session. Monitor checks for ASI08 are scoped to ASI08-cascade-failure-trace-bridge-detector. See Monitoring for the underlying primitive.
Customizing for your domain
Each category has the same shape:
owasp/LLM01-prompt-injection/
├── scenarios/ # .jsonl seed inputs
├── checks/ # .md (model-based) or .py (code-based)
├── drivers/ # .md adversarial personas
└── notebooks/
└── run-evaluation.ipynb
Customize one or more in place:
| Change this | When |
|---|---|
Scenarios (scenarios/*.jsonl) | Add domain-specific seeds. A network-infra agent should test BGP/NETCONF injection; a healthcare agent should test PII leakage with realistic patient identifiers. |
Checks (checks/*.md or *.py) | Encode policies your domain cares about (e.g. "must redact account numbers", "must not call billing API without confirmation"). |
Drivers (drivers/*.md) | Add adversarial personas matching your threat model. The shipped jailbreak-escalator.md covers crescendo attacks; add your own for industry-specific manipulation patterns. |
Target (target.json) | Use distinct files per environment (target.dev.json, target.prod.json) and pass with --target. |
All artifacts include metadata: owasp_category, risk_severity, artifact_type, status, version. Keep this metadata accurate when you fork - it is what makes the resulting test runs traceable for compliance reporting.
Authoring new scenarios with Spec-Kit
compliance-owasp uses Spec-Kit for spec-driven authoring of new categories or scenarios. The flow is Specify → Plan → Tasks → Implement, with slash commands like /speckit.specify to generate a feature spec from a one-paragraph description.
If you are extending the suite (e.g. adding a domain-specific category alongside the OWASP defaults), the constitution at .specify/memory/constitution.md defines six non-negotiable principles - including OWASP-complete coverage, explainability, and simulation-driven coverage for stateful risks. Specs that violate the constitution are rejected at the planning step.
Where to go next
- Programmatic Red Teaming - red teaming as an engagement discipline: when to run it, how to scope it, what the deliverable looks like.
- Adversarial Drivers - how to author the attacker personas that drive multi-turn OWASP scenarios.
- Validating Guardrails - the independent-test pattern for the guardrail layer that sits in front of your model.
- compliance-owasp on GitHub - the repo. Issues and PRs welcome.