Skip to main content
  1. Posts/

AI Agents Are the New Insider Threat: Breaking Down the OWASP Agentic Top 10

The Shift Nobody Was Ready For
#

For years, securing AI meant worrying about prompt injection and data leakage in chatbots. Annoying? Sure. But the blast radius was limited. A chatbot says something it shouldn’t, you clean it up and move on.

That era is over.

AI agents in 2026 aren’t answering questions. They’re taking actions. They call APIs, query databases, execute code, modify files, trigger workflows, and chain decisions across multiple systems. Often with minimal human oversight. IBM’s Agentic AI Security Guide, published earlier this month, put it well: AI agents should be treated as “digital insiders” whose risk profile looks a lot more like an insider threat than a software vulnerability.

And we’re deploying them at a staggering pace. Cisco’s State of AI Security 2026 report found that 83% of organizations planned to deploy agentic AI into business functions. Only 29% felt they were ready to do it securely. The Gravitee State of AI Agent Security 2026 report paints an even sharper picture: 88% of organizations reported confirmed or suspected AI agent security incidents in the past year, yet only 14.4% have full security and IT approval for their entire agent fleet.

Read that last number again. 14.4%. The rest is shadow AI running in production.

Enter the OWASP Agentic Top 10
#

In December 2025, the OWASP GenAI Security Project released the Top 10 for Agentic Applications. This isn’t a minor update to the existing LLM Top 10. It’s a completely separate framework built specifically for autonomous AI systems that plan, decide, and act across complex workflows. Over 100 security researchers, industry practitioners, and representatives from organizations like NIST, AWS, Microsoft, and the Alan Turing Institute contributed to its development.

The framework introduces a key design principle that should sound familiar to anyone who has worked in systems security: least agency. Only grant agents the minimum autonomy required to perform safe, bounded tasks. If that reminds you of least privilege from traditional IAM, it should. The difference is that we’re now applying it to systems that can dynamically select their own tools and decide their own next steps.

Here’s the full list:

IDRiskDescription
ASI01Agent Goal HijackAttackers redirect agent objectives through poisoned inputs
ASI02Tool Misuse & ExploitationAgents misuse legitimate tools due to prompt manipulation or unsafe delegation
ASI03Identity & Privilege AbuseExploitation of inherited credentials, cached tokens, or agent-to-agent trust
ASI04Agentic Supply Chain VulnerabilitiesCompromised tools, model files, or MCP server definitions
ASI05Unexpected Code ExecutionAgents generating or executing untrusted or attacker-controlled code
ASI06Memory & Context PoisoningPersistent corruption of agent memory, RAG stores, or contextual knowledge
ASI07Insecure Inter-Agent CommunicationSpoofed, intercepted, or manipulated messages between agents
ASI08Cascading FailuresFalse signals propagating through automated pipelines with escalating impact
ASI09Human-Agent Trust ExploitationPolished, confident agent outputs misleading human operators into approving harmful actions
ASI10Rogue AgentsMisaligned or compromised agents diverging from intended behavior

Rather than walk through all ten (the OWASP documentation is excellent and freely available), I want to focus on the ones I think have the most immediate practical impact for security teams.

ASI01: Agent Goal Hijack
#

This one sits at the top of the list for good reason. Agent goal hijacking occurs when an attacker manipulates an agent’s objectives by injecting malicious instructions into data the agent processes. This could be a poisoned email, a crafted PDF, a meeting invite, or a document sitting in a RAG pipeline.

The fundamental problem is that agents can’t reliably separate instructions from data. A single poisoned document in a retrieval pipeline can redirect an agent from summarizing files to exfiltrating them. The EchoLeak attack demonstrated exactly this pattern, turning copilots into silent exfiltration engines through hidden prompts.

This isn’t theoretical. State-sponsored actors are already exploiting this at scale. Cisco’s report documents a China-linked group that automated 80 to 90 percent of an attack chain by jailbreaking an AI coding assistant and directing it to scan ports, identify vulnerabilities, and develop exploit scripts.

What to do about it:

  • Treat all natural language input as untrusted, including data the agent retrieves from internal sources
  • Implement strict input validation and content filtering before agent processing
  • Limit tool privileges so that even if an agent’s goal is hijacked, the blast radius is contained
  • Monitor for behavioral anomalies in agent actions, not just the prompts themselves

ASI03: Identity & Privilege Abuse
#

This is the one that keeps me up at night from an architecture perspective.

Traditional IAM was built for humans and service accounts. We understand how to scope permissions for a user who logs in, performs a set of predictable actions, and logs out. AI agents break that model in several ways. They inherit permissions from their deployers. They cache tokens across sessions. They dynamically delegate tasks to other agents. And in multi-agent systems, trust relationships between agents can be exploited just like trust relationships between services.

Consider this scenario from the OWASP documentation: a compromised research agent inserts hidden instructions into its output. That output gets consumed by a financial agent, which then executes unintended trades using its own (legitimate) credentials. The research agent never needed direct access to the trading system. It just needed to poison the pipeline.

The average enterprise now faces an 82:1 machine-to-human identity ratio according to Palo Alto Networks. Add autonomous AI agents to that mix, each with their own credentials, inherited permissions, and dynamic trust relationships, and you start to see why traditional IAM isn’t going to cut it.

What to do about it:

  • Treat agent identities as first-class non-human identities with their own lifecycle management
  • Implement runtime authorization, not just static OAuth scopes. Permissions need to be contextual and evaluated per-action
  • Audit every credential an agent holds and ask whether it actually needs it
  • Design for revocability. You need a kill switch that can immediately sever an agent’s access

ASI04: Agentic Supply Chain Vulnerabilities
#

If you’ve been paying attention to software supply chain security over the past few years, this risk category will feel familiar, but with a twist. The agentic supply chain isn’t just npm packages and container images. It includes MCP server definitions, tool descriptors, agent personas, model files, and prompt templates.

The Barracuda Security report identified 43 different agent framework components with embedded vulnerabilities introduced through supply chain compromise. Model files hosted on open-source repositories can contain executable code that runs during loading. A GitHub MCP exploit demonstrated how easily runtime components could be poisoned. And the rush to adopt frameworks without security review is creating exactly the kind of exposure that supply chain attackers love.

A supply chain attack targeting Cline users was recently caught installing backdoored AI tools through a compromised package. Meanwhile, researchers have demonstrated that injecting just 250 poisoned documents into training data can implant backdoors that activate under specific trigger phrases while leaving general performance completely intact.

What to do about it:

  • Maintain an allowlist of approved versions for all agent framework components
  • Verify components against official security bulletins, not just git repositories
  • Pin versions and use signed manifests for MCP servers, plugins, and prompt templates
  • Treat every new tool definition and model file as untrusted until verified

ASI09: Human-Agent Trust Exploitation
#

This one doesn’t get enough attention. AI agents produce outputs that are articulate, confident, and well-structured. Humans tend to trust confident, articulate communication. This creates a dangerous dynamic where a compromised or malfunctioning agent can present harmful recommendations in a way that looks completely reasonable, and the human operator approves them without sufficient scrutiny.

The OWASP framework explicitly calls this out as a distinct risk category because it targets the human governance layer. All the technical controls in the world don’t help if the human in the loop rubber-stamps everything the agent suggests because the agent “sounds right.”

This is where organizational culture and training matter as much as technology. Security teams need to educate operators on what healthy skepticism of agent outputs looks like, especially for high-impact decisions involving financial transactions, data access changes, or production deployments.

What This Means for Security Teams Right Now
#

If your organization is deploying AI agents (or plans to), here’s where I’d focus:

Get visibility first. You can’t secure what you can’t see. Inventory every agent operating in your environment, including the ones that were spun up by enthusiastic developers without going through security review. Remember that 14.4% number. Odds are good that shadow AI agents are already running in your org.

Extend your IAM strategy. Your identity governance needs to treat AI agents as a new class of non-human identity. This means lifecycle management, least privilege enforcement, runtime authorization, and the ability to revoke access immediately.

Secure the supply chain. Audit all MCP servers, plugins, prompt templates, and model files in use. Use pinned versions and signed artifacts. Don’t assume that because a framework is popular on GitHub, it’s safe.

Rethink your monitoring. Traditional SIEM isn’t built for agentic threats. You need behavioral analytics that understand agent action patterns and can flag deviations in real time. Log every decision, tool call, and state change.

Require human approval for critical actions. Payments, data deletion, credential access, and production deployments should always require explicit human sign-off regardless of how confident the agent appears. Build that into the architecture, not as an afterthought.

Run red team exercises specifically targeting your agents. Test for prompt injection, tool misuse, privilege escalation, and memory poisoning. You’ll almost certainly find that your agents are more vulnerable than you expected.

The Bigger Picture
#

The OWASP Agentic Top 10 represents a genuine inflection point. Just as the original OWASP Top 10 gave the industry a shared language for web application security risks, this framework gives us a starting point for reasoning about autonomous AI risks before they scale further.

But frameworks only matter if people actually use them. The gap between AI adoption speed and security readiness is widening, not closing. If you’re in a security or GRC role, now is the time to bring this framework into your risk conversations, your architecture reviews, and your board-level discussions about AI strategy.

The agents are already running. The question is whether your security program is keeping up.


Further Reading:

Juan Carlos Munera
Author
Juan Carlos Munera
Passionate about cybersecurity, governance, risk, and compliance. Sharing insights on security best practices, frameworks, and industry trends.

Related

Quantum Won't Kill Encryption. It Never Has.

If you’ve spent any time on LinkedIn or at a cybersecurity conference in the last couple of years, you’ve seen the headlines. “Quantum computing will break all encryption.” “Your data is already at risk.” “The cryptographic apocalypse is coming.” It makes for great conference talks and even better vendor marketing. But here’s the thing: encryption has always been broken. And every single time, we’ve replaced it with something stronger. The lifecycle of cryptographic algorithms isn’t a flaw in the system; it is the system. So why would quantum computing be any different?