OWASP LLM & Agentic Top 10 Testing

OWASP publishes two security frameworks that any team shipping generative AI to production should be measuring against:

OWASP Top 10 for LLM Applications (2025) - the canonical list of LLM-specific risks, from prompt injection to unbounded consumption.
OWASP Top 10 for Agentic Applications (2026) - the newer list covering risks unique to autonomous, tool-using agents: goal hijack, tool misuse, identity privilege abuse, cascading failures, and more.

Okareo maintains compliance-owasp, a first-party open-source repository that turns these standards into a runnable test suite. It covers all 20 categories (LLM01-LLM10, ASI01-ASI10) with 55 discrete test scenarios, adversarial driver personas, model-based and code-based checks, and reproducible Jupyter notebooks. The repo is the source of truth; Okareo is the deployment target.

Two-minute quick start

You can start testing your agent's OWASP compliance in under two minutes with the OWASP Compliance Quick Start. Fork the repo, point it at your agent endpoint, run python run_suite.py --dir LLM01-prompt-injection.

Why use compliance-owasp instead of writing it yourself

Most teams know they should be testing for prompt injection or excessive agency, but never get to it because the up-front cost is high: you need adversarial scenarios, attacker personas, and judges that know what "the agent leaked the system prompt" looks like. compliance-owasp ships those primitives, parameterized so you can fork once and customize per category.

What you get	What's in the repo
Scenarios	`.jsonl` seed inputs per category (5 for LLM01, 3 for LLM02, etc.)
Adversarial drivers	Markdown persona prompts that simulate attackers across 5-10 conversational turns
Checks	Model-based judges (`.md`) for behavioral evaluation, code-based checks (`.py`) for deterministic rules
Notebooks	One `run-evaluation.ipynb` per category, idempotent, traceable
CLI runner	`run_suite.py` uploads artifacts and runs the full evaluation in one command

OWASP LLM Top 10 (2025) coverage

ID	Category	Severity	Eval mode
LLM01	Prompt Injection	Critical	Single + multi-turn
LLM02	Sensitive Information Disclosure	Critical	Single-turn
LLM03	Supply Chain Vulnerabilities	High	Single-turn
LLM04	Data and Model Poisoning	High	Single-turn
LLM05	Improper Output Handling	High	Single-turn
LLM06	Excessive Agency	Critical	Multi-turn
LLM07	System Prompt Leakage	High	Single + multi-turn
LLM08	Vector and Embedding Weaknesses	High	Single + multi-turn
LLM09	Misinformation	Medium	Single-turn
LLM10	Unbounded Consumption	Medium	Multi-turn

OWASP Agentic AI Top 10 (2026) coverage

ID	Category	Severity	Eval mode
ASI01	Agent Goal Hijack	Critical	Multi-turn
ASI02	Tool Misuse and Exploitation	Critical	Multi-turn
ASI03	Identity and Privilege Abuse	Critical	Multi-turn
ASI04	Agentic Supply Chain Vulnerabilities	High	Single-turn
ASI05	Unexpected Code Execution (RCE)	Critical	Multi-turn
ASI06	Memory & Context Poisoning	Critical	Multi-turn
ASI07	Insecure Inter-Agent Communication	High	Single-turn
ASI08	Cascading Failures	High	Multi-turn + trace bridge
ASI09	Human-Agent Trust Exploitation	High	Multi-turn
ASI10	Rogue Agents	Critical	Multi-turn

Single-turn vs multi-turn: which pattern, when

The 20 OWASP categories split cleanly into two evaluation patterns. Picking the right one is the most important decision when adapting the suite to your agent.

Single-turn checks test a stateless property. One adversarial input goes in, one response comes out, and a check evaluates whether the response leaked, hallucinated, or generated unsafe output. Use single-turn for: PII exfiltration, credential leakage, output schema violations, hallucination, supply-chain validation. The OWASP suite uses single-turn evaluation for LLM02, LLM03, LLM04, LLM05, LLM09, ASI04, and ASI07.

Multi-turn simulations test a behavioral property that only emerges across conversation. An adversarial driver persona pushes the agent across 5-10 turns, attempting crescendo escalation, role manipulation, or progressive permission widening. Use multi-turn for: jailbreak chains, excessive agency, iterative system-prompt extraction, goal hijack, tool misuse, memory poisoning. Most ASI categories require multi-turn because agentic risks are inherently temporal: the failure pattern is "the agent did the wrong thing on turn 7 because of what happened on turn 3."

LLM01 (prompt injection) and LLM07 (system prompt leakage) use both: single-turn for the obvious blunt attacks, multi-turn for the patient ones.

See the Multi-Turn Simulation overview for the underlying simulation primitive, and Adversarial Drivers for how to author the attacker personas.

Run a category: fork, point, run

# 1. Fork and clone
git clone https://github.com/<your-org>/compliance-owasp.git
cd compliance-owasp

# 2. Configure
cp owasp/config.env.example .env
cp owasp/target.json.example owasp/target.json
# Edit .env with OKAREO_API_KEY
# Edit owasp/target.json with your agent's endpoint, request body, response path

# 3. Install
uv sync

# 4. Run a category end-to-end
uv run python run_suite.py --dir LLM01-prompt-injection

run_suite.py uploads scenarios, checks, and drivers to your Okareo workspace, then runs evaluations. Re-running is idempotent. Common flags:

Flag	Use
`--dir LLM06-excessive-agency`	Required. Which category to run.
`--max-turns 8`	Override default turn count for multi-turn simulations.
`--sim iterative-extraction`	Run only simulations whose name matches this substring.
`--upload-only`	Push artifacts without running.
`--eval-only`	Run against artifacts already uploaded.
`--target owasp/target.prod.json`	Use a different target config (e.g. for a separate environment).

You can also run from the per-category notebook (owasp/LLM01-prompt-injection/notebooks/run-evaluation.ipynb) if you prefer an interactive surface.

ASI08: bridging live traces to simulation

The ASI08 (Cascading Failures) category includes a trace-simulation bridge that links live OpenTelemetry traces from your production agent to simulation runs via shared context_token and session.id fields. This is how you take a real failure observed in production and turn it into a reproducible simulation scenario, with the original trace attached as evidence.

# Configure
cp owasp/target.json.trace-bridge-example owasp/target.json

# Upload artifacts and run the bridge simulation
uv run python run_suite.py --dir ASI08-cascading-failures \
  --sim pipeline-cascade-failure --max-turns 8

# Continue in the trace-bridge notebook for OTEL ingestion
# and online datapoint evaluation:
# owasp/ASI08-cascading-failures/notebooks/run-trace-bridge-evaluation.ipynb

When a selected simulation declares "requires_monitor": true in eval_config.json, run_suite.py automatically creates (or reuses) a single category-scoped Monitor instead of one per session. Monitor checks for ASI08 are scoped to ASI08-cascade-failure-trace-bridge-detector. See Monitoring for the underlying primitive.

Customizing for your domain

Each category has the same shape:

owasp/LLM01-prompt-injection/
├── scenarios/         # .jsonl seed inputs
├── checks/            # .md (model-based) or .py (code-based)
├── drivers/           # .md adversarial personas
└── notebooks/
    └── run-evaluation.ipynb

Customize one or more in place:

Change this	When
Scenarios (`scenarios/*.jsonl`)	Add domain-specific seeds. A network-infra agent should test BGP/NETCONF injection; a healthcare agent should test PII leakage with realistic patient identifiers.
Checks (`checks/.md` or `.py`)	Encode policies your domain cares about (e.g. "must redact account numbers", "must not call billing API without confirmation").
Drivers (`drivers/*.md`)	Add adversarial personas matching your threat model. The shipped `jailbreak-escalator.md` covers crescendo attacks; add your own for industry-specific manipulation patterns.
Target (`target.json`)	Use distinct files per environment (`target.dev.json`, `target.prod.json`) and pass with `--target`.

All artifacts include metadata: owasp_category, risk_severity, artifact_type, status, version. Keep this metadata accurate when you fork - it is what makes the resulting test runs traceable for compliance reporting.

Authoring new scenarios with Spec-Kit

compliance-owasp uses Spec-Kit for spec-driven authoring of new categories or scenarios. The flow is Specify → Plan → Tasks → Implement, with slash commands like /speckit.specify to generate a feature spec from a one-paragraph description.

If you are extending the suite (e.g. adding a domain-specific category alongside the OWASP defaults), the constitution at .specify/memory/constitution.md defines six non-negotiable principles - including OWASP-complete coverage, explainability, and simulation-driven coverage for stateful risks. Specs that violate the constitution are rejected at the planning step.

Where to go next

Programmatic Red Teaming - red teaming as an engagement discipline: when to run it, how to scope it, what the deliverable looks like.
Adversarial Drivers - how to author the attacker personas that drive multi-turn OWASP scenarios.
Validating Guardrails - the independent-test pattern for the guardrail layer that sits in front of your model.
compliance-owasp on GitHub - the repo. Issues and PRs welcome.

Why use compliance-owasp instead of writing it yourself​

OWASP LLM Top 10 (2025) coverage​

OWASP Agentic AI Top 10 (2026) coverage​

Single-turn vs multi-turn: which pattern, when​

Run a category: fork, point, run​

ASI08: bridging live traces to simulation​

Customizing for your domain​

Authoring new scenarios with Spec-Kit​

Where to go next​