Skip to main content

OWASP LLM & Agentic Top 10 Testing

OWASP publishes two security frameworks that any team shipping generative AI to production should be measuring against:

Okareo maintains compliance-owasp, a first-party open-source repository that turns these standards into a runnable test suite. It covers all 20 categories (LLM01-LLM10, ASI01-ASI10) with 55 discrete test scenarios, adversarial driver personas, model-based and code-based checks, and reproducible Jupyter notebooks. The repo is the source of truth; Okareo is the deployment target.

Two-minute quick start

You can start testing your agent's OWASP compliance in under two minutes with the OWASP Compliance Quick Start. Fork the repo, point it at your agent endpoint, run python run_suite.py --dir LLM01-prompt-injection.

Why use compliance-owasp instead of writing it yourself

Most teams know they should be testing for prompt injection or excessive agency, but never get to it because the up-front cost is high: you need adversarial scenarios, attacker personas, and judges that know what "the agent leaked the system prompt" looks like. compliance-owasp ships those primitives, parameterized so you can fork once and customize per category.

What you getWhat's in the repo
Scenarios.jsonl seed inputs per category (5 for LLM01, 3 for LLM02, etc.)
Adversarial driversMarkdown persona prompts that simulate attackers across 5-10 conversational turns
ChecksModel-based judges (.md) for behavioral evaluation, code-based checks (.py) for deterministic rules
NotebooksOne run-evaluation.ipynb per category, idempotent, traceable
CLI runnerrun_suite.py uploads artifacts and runs the full evaluation in one command

OWASP LLM Top 10 (2025) coverage

IDCategorySeverityEval mode
LLM01Prompt InjectionCriticalSingle + multi-turn
LLM02Sensitive Information DisclosureCriticalSingle-turn
LLM03Supply Chain VulnerabilitiesHighSingle-turn
LLM04Data and Model PoisoningHighSingle-turn
LLM05Improper Output HandlingHighSingle-turn
LLM06Excessive AgencyCriticalMulti-turn
LLM07System Prompt LeakageHighSingle + multi-turn
LLM08Vector and Embedding WeaknessesHighSingle + multi-turn
LLM09MisinformationMediumSingle-turn
LLM10Unbounded ConsumptionMediumMulti-turn

OWASP Agentic AI Top 10 (2026) coverage

IDCategorySeverityEval mode
ASI01Agent Goal HijackCriticalMulti-turn
ASI02Tool Misuse and ExploitationCriticalMulti-turn
ASI03Identity and Privilege AbuseCriticalMulti-turn
ASI04Agentic Supply Chain VulnerabilitiesHighSingle-turn
ASI05Unexpected Code Execution (RCE)CriticalMulti-turn
ASI06Memory & Context PoisoningCriticalMulti-turn
ASI07Insecure Inter-Agent CommunicationHighSingle-turn
ASI08Cascading FailuresHighMulti-turn + trace bridge
ASI09Human-Agent Trust ExploitationHighMulti-turn
ASI10Rogue AgentsCriticalMulti-turn

Single-turn vs multi-turn: which pattern, when

The 20 OWASP categories split cleanly into two evaluation patterns. Picking the right one is the most important decision when adapting the suite to your agent.

Single-turn checks test a stateless property. One adversarial input goes in, one response comes out, and a check evaluates whether the response leaked, hallucinated, or generated unsafe output. Use single-turn for: PII exfiltration, credential leakage, output schema violations, hallucination, supply-chain validation. The OWASP suite uses single-turn evaluation for LLM02, LLM03, LLM04, LLM05, LLM09, ASI04, and ASI07.

Multi-turn simulations test a behavioral property that only emerges across conversation. An adversarial driver persona pushes the agent across 5-10 turns, attempting crescendo escalation, role manipulation, or progressive permission widening. Use multi-turn for: jailbreak chains, excessive agency, iterative system-prompt extraction, goal hijack, tool misuse, memory poisoning. Most ASI categories require multi-turn because agentic risks are inherently temporal: the failure pattern is "the agent did the wrong thing on turn 7 because of what happened on turn 3."

LLM01 (prompt injection) and LLM07 (system prompt leakage) use both: single-turn for the obvious blunt attacks, multi-turn for the patient ones.

See the Multi-Turn Simulation overview for the underlying simulation primitive, and Adversarial Drivers for how to author the attacker personas.

Run a category: fork, point, run

# 1. Fork and clone
git clone https://github.com/<your-org>/compliance-owasp.git
cd compliance-owasp

# 2. Configure
cp owasp/config.env.example .env
cp owasp/target.json.example owasp/target.json
# Edit .env with OKAREO_API_KEY
# Edit owasp/target.json with your agent's endpoint, request body, response path

# 3. Install
uv sync

# 4. Run a category end-to-end
uv run python run_suite.py --dir LLM01-prompt-injection

run_suite.py uploads scenarios, checks, and drivers to your Okareo workspace, then runs evaluations. Re-running is idempotent. Common flags:

FlagUse
--dir LLM06-excessive-agencyRequired. Which category to run.
--max-turns 8Override default turn count for multi-turn simulations.
--sim iterative-extractionRun only simulations whose name matches this substring.
--upload-onlyPush artifacts without running.
--eval-onlyRun against artifacts already uploaded.
--target owasp/target.prod.jsonUse a different target config (e.g. for a separate environment).

You can also run from the per-category notebook (owasp/LLM01-prompt-injection/notebooks/run-evaluation.ipynb) if you prefer an interactive surface.

ASI08: bridging live traces to simulation

The ASI08 (Cascading Failures) category includes a trace-simulation bridge that links live OpenTelemetry traces from your production agent to simulation runs via shared context_token and session.id fields. This is how you take a real failure observed in production and turn it into a reproducible simulation scenario, with the original trace attached as evidence.

# Configure
cp owasp/target.json.trace-bridge-example owasp/target.json

# Upload artifacts and run the bridge simulation
uv run python run_suite.py --dir ASI08-cascading-failures \
--sim pipeline-cascade-failure --max-turns 8

# Continue in the trace-bridge notebook for OTEL ingestion
# and online datapoint evaluation:
# owasp/ASI08-cascading-failures/notebooks/run-trace-bridge-evaluation.ipynb

When a selected simulation declares "requires_monitor": true in eval_config.json, run_suite.py automatically creates (or reuses) a single category-scoped Monitor instead of one per session. Monitor checks for ASI08 are scoped to ASI08-cascade-failure-trace-bridge-detector. See Monitoring for the underlying primitive.

Customizing for your domain

Each category has the same shape:

owasp/LLM01-prompt-injection/
├── scenarios/ # .jsonl seed inputs
├── checks/ # .md (model-based) or .py (code-based)
├── drivers/ # .md adversarial personas
└── notebooks/
└── run-evaluation.ipynb

Customize one or more in place:

Change thisWhen
Scenarios (scenarios/*.jsonl)Add domain-specific seeds. A network-infra agent should test BGP/NETCONF injection; a healthcare agent should test PII leakage with realistic patient identifiers.
Checks (checks/*.md or *.py)Encode policies your domain cares about (e.g. "must redact account numbers", "must not call billing API without confirmation").
Drivers (drivers/*.md)Add adversarial personas matching your threat model. The shipped jailbreak-escalator.md covers crescendo attacks; add your own for industry-specific manipulation patterns.
Target (target.json)Use distinct files per environment (target.dev.json, target.prod.json) and pass with --target.

All artifacts include metadata: owasp_category, risk_severity, artifact_type, status, version. Keep this metadata accurate when you fork - it is what makes the resulting test runs traceable for compliance reporting.

Authoring new scenarios with Spec-Kit

compliance-owasp uses Spec-Kit for spec-driven authoring of new categories or scenarios. The flow is Specify → Plan → Tasks → Implement, with slash commands like /speckit.specify to generate a feature spec from a one-paragraph description.

If you are extending the suite (e.g. adding a domain-specific category alongside the OWASP defaults), the constitution at .specify/memory/constitution.md defines six non-negotiable principles - including OWASP-complete coverage, explainability, and simulation-driven coverage for stateful risks. Specs that violate the constitution are rejected at the planning step.

Where to go next