Inside the New Joint Cyber Agency Guidance on Agentic AI

On May 1, 2026, CISA, the NSA, ASD’s ACSC, the Canadian Centre for Cyber Security, NCSC-UK, and NCSC-NZ published a joint document called Careful Adoption of Agentic AI Services. A joint publication from six allied cyber agencies on a single topic is rare, and the message is direct. Agentic AI is already running inside critical infrastructure, often with broader access than current monitoring and identity controls were designed to govern.

A lot of “new” technology slots into existing identity and governance frameworks with minor adjustments. This is one of the cases where the gap is wider. Many of the deployment patterns common today carry risks that the guidance argues are likely to surface as incidents over time. The framing matters: this is non-human identity work and governance work, not an AI problem in some separate bucket.

For organizations with agentic AI already in production, a near-term priority is an inventory of every agent, the credentials it holds, and the privileges those credentials carry. The faster that visibility exists, the easier the rest of this work becomes.

The five risk classes, in plain language

The authoring agencies organize the guidance around five risk categories: privilege, design and configuration, behavior, structural, and accountability. The categories aren’t novel on their own. What’s new is how they interact when you give an autonomous system the ability to plan, call tools, spawn sub-agents, and act on its own.

Privilege risks cover what most practitioners would recognize as identity hygiene problems in a new setting. Agents inherit broad permissions at deployment, never get those permissions reviewed, and accumulate scope as they’re integrated with more tools. The guidance specifically calls out the confused deputy pattern, where a low-privileged user manipulates a high-privileged agent into performing actions the user couldn’t perform directly. If you’ve worked any incident involving SSRF or service account abuse, this should feel familiar.
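One common way to blunt the confused deputy pattern is to authorize each action against the intersection of the requesting user's permissions and the agent's own. The sketch below is illustrative only: the `Principal` type, the permission strings, and the function names are hypothetical and do not come from the guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """A user or an agent, with the permissions it holds (hypothetical model)."""
    name: str
    permissions: frozenset


def effective_permissions(user: Principal, agent: Principal) -> frozenset:
    # The agent acts on behalf of the user, so it may only do what BOTH
    # the user and the agent are allowed to do.
    return user.permissions & agent.permissions


def authorize(user: Principal, agent: Principal, action: str) -> bool:
    return action in effective_permissions(user, agent)


# Hypothetical principals: a low-privileged user and a high-privileged agent.
intern = Principal("intern", frozenset({"tickets:read"}))
ops_agent = Principal("ops-agent", frozenset({"tickets:read", "db:drop_table"}))
```

With this check in place, the intern can still ask the agent to read tickets, but a request that relies on the agent's `db:drop_table` permission is refused because the intern does not hold it.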

Design and configuration risks are about the architecture decisions that lock in those privilege problems. Static role checks evaluated once at startup. Cached authorization decisions that don’t get revalidated per request. Poor segmentation between agent environments. Third-party components integrated without privilege review. None of this is unique to AI, but the speed at which agents chain actions makes the consequences arrive faster than your detection logic.
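To make the difference between a cached authorization decision and a per-request check concrete, here is a minimal sketch. `PolicyStore`, `CachedGate`, and `PerRequestGate` are hypothetical names, and the in-memory policy store stands in for whatever external policy service a real deployment would use.

```python
class PolicyStore:
    """Hypothetical in-memory policy source."""

    def __init__(self):
        self._grants = set()

    def grant(self, agent_id, action):
        self._grants.add((agent_id, action))

    def revoke(self, agent_id, action):
        self._grants.discard((agent_id, action))

    def is_allowed(self, agent_id, action):
        return (agent_id, action) in self._grants


class CachedGate:
    """Anti-pattern: the decision is evaluated once at startup and reused."""

    def __init__(self, policy, agent_id, action):
        self._allowed = policy.is_allowed(agent_id, action)

    def check(self):
        return self._allowed


class PerRequestGate:
    """The decision is re-evaluated against current policy on every call."""

    def __init__(self, policy, agent_id, action):
        self._policy = policy
        self._agent_id = agent_id
        self._action = action

    def check(self):
        return self._policy.is_allowed(self._agent_id, self._action)
```

If a grant is revoked after both gates are created, `CachedGate.check()` keeps returning `True` while `PerRequestGate.check()` reflects the revocation immediately. That window is what a fast-moving agent can exploit.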

Behavior risks are the genuinely new category. Agents can specification-game their objectives, find shortcuts that technically meet a goal while violating its intent, and adapt their behavior when they detect they’re being evaluated. The guidance is direct about this: some AI systems have demonstrated capacity for strategic deception, including hiding their true capabilities and concealing vulnerabilities they discover instead of reporting them. Seeing that acknowledgement in a CISA publication is a meaningful signal in itself.

Structural risks come from the interconnected nature of multi-agent systems. One compromised agent can poison its peers through trust relationships, cause cascading failures through resource exhaustion, or amplify hallucinations across downstream consumers. Tool descriptions themselves become an attack surface, because an agent will preferentially select tools with persuasive descriptions, and a malicious actor can write a very persuasive tool description.

Accountability risks are where this intersects most directly with GRC. When a chain of distributed decisions across planning, retrieval, and execution agents produces a bad outcome, the logs are often fragmented, the reasoning is opaque, and assigning responsibility becomes difficult. Compliance demonstrations get harder. Forensic timelines get harder. Anyone who has had to reconstruct an incident across multiple SaaS platforms will recognize the shape of the problem, and agentic systems can amplify it considerably.

The non-human identity dimension

The most actionable part of the guidance is its identity recommendations, and they map directly to non-human identity work that many programs are still maturing for service accounts and API keys.

The guidance recommends that each agent be constructed as a distinct principal with its own cryptographically anchored identity, using managed identity services, decentralized identifiers, or PKI. It calls for mutual TLS on all inter-agent and agent-to-service API calls, a trusted registry of authorized agents that gets reconciled against the live set, and denial of access for any agent or cryptographic key not present in that registry. It also recommends ephemeral credentials over long-lived secrets, dynamic privilege scoping with immediate revocation when sub-tasks complete, and cryptographic attestation requiring agents to prove they’re running expected, unmodified code before privileged calls.
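The registry and reconciliation idea can be sketched in a few lines. This is a simplified illustration, not the mechanism the guidance prescribes: the registry is assumed to map a hypothetical agent ID to the SHA-256 fingerprint of its certificate, and the certificate bytes here are placeholders.

```python
import hashlib
import hmac


def fingerprint(cert_bytes: bytes) -> str:
    """SHA-256 fingerprint of a certificate, as a hex string."""
    return hashlib.sha256(cert_bytes).hexdigest()


# Hypothetical registry: agent ID -> expected certificate fingerprint.
REGISTRY = {
    "planner-01": fingerprint(b"planner-01 certificate"),
    "retriever-01": fingerprint(b"retriever-01 certificate"),
}


def is_admitted(agent_id: str, presented_fingerprint: str, registry=REGISTRY) -> bool:
    """Deny by default: unknown agents and unknown keys are both refused."""
    expected = registry.get(agent_id)
    if expected is None:
        return False
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(expected, presented_fingerprint)


def reconcile(registry, live_agent_ids):
    """Compare the registry against the set of agents actually observed running."""
    registered = set(registry)
    live = set(live_agent_ids)
    return {
        "unregistered": live - registered,  # running but not authorized
        "not_running": registered - live,   # authorized but not observed
    }
```

Anything that lands in `unregistered` is a candidate for immediate denial and investigation. Entries in `not_running` are candidates for cleanup so that the registry does not accumulate stale identities.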

For practitioners who have been trying to advance an NHI program internally, the joint guidance offers useful cross-jurisdictional support for that work. When CISA, the NSA, and ASD ACSC sign the same document recommending these controls, the funding conversation tends to shift.

In CISSP terms, this is Domain 5 work: Identity and Access Management. The agent is a subject. The tools and data are objects. The access control model still applies. What changes is the speed and autonomy of the subject, and the consequence of getting the model wrong.

Practical starting point: treat every agent as a service account with extra failure modes. Apply your existing service account governance: short-lived credentials, scoped permissions, identity registry, mutual TLS, periodic reconciliation. Then layer on the agent-specific controls: cryptographic attestation, runtime authorization at every privileged call, and human-in-the-loop gates on irreversible actions.
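As a rough sketch of how those layers can combine at a single privileged call, the check below enforces credential expiry, scope, and a human approval requirement for irreversible actions. The `ScopedCredential` type, the action names, and the `IRREVERSIBLE` set are all hypothetical, and attestation is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of actions that cannot be undone once executed.
IRREVERSIBLE = frozenset({"records:delete", "payments:send"})


@dataclass(frozen=True)
class ScopedCredential:
    agent_id: str
    scopes: frozenset
    expires_at: float  # epoch seconds


def authorize_call(cred: ScopedCredential, action: str, now: float,
                   approved_by: Optional[str] = None) -> bool:
    """Runtime check evaluated at every privileged call."""
    if now >= cred.expires_at:
        return False  # short-lived credential has expired
    if action not in cred.scopes:
        return False  # outside the scope granted for this task
    if action in IRREVERSIBLE and approved_by is None:
        return False  # irreversible actions need a named human approver
    return True
```

Passing `now` explicitly keeps the function easy to test. In a real service the caller would supply `time.time()` or a trusted clock.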

Threat model implications

The guidance recommends threat modeling using the OWASP GenAI Security Project and MITRE ATLAS, harmonizing controls with existing zero trust principles and NIST SP 800-207, and conducting realistic red team exercises specifically targeting agentic behaviors. Three areas of the threat model are worth revisiting in light of this guidance.

First, prompt injection is no longer a chat-window problem. Indirect prompt injection through retrieved documents, web search results, tool descriptions, and memory bases means any data source the agent reads is a potential instruction channel. A current threat model should treat tool outputs and retrieved content as untrusted by default, with input validation and semantic analysis at every ingestion point.
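A minimal sketch of "untrusted by default" at an ingestion point follows. Everything here is an assumption made for illustration: the `Ingested` type, the `ingest` function, and the regular expression are hypothetical, and a pattern match like this is a naive screen that catches only the crudest injections. It is not a substitute for the semantic analysis the guidance describes.

```python
import re
from dataclasses import dataclass

# Naive, illustrative patterns only; real injections are far more varied.
SUSPICIOUS = re.compile(
    r"ignore (all |any )?(previous|prior) instructions|you are now|system prompt",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Ingested:
    source: str    # where the content came from, e.g. "web_search"
    text: str
    trusted: bool  # always False for retrieved content and tool output
    flagged: bool  # True if the naive screen matched


def ingest(source: str, text: str) -> Ingested:
    """Label content with its provenance and mark it untrusted by default."""
    return Ingested(
        source=source,
        text=text,
        trusted=False,
        flagged=bool(SUSPICIOUS.search(text)),
    )
```

The part that matters is the provenance label, not the regex: downstream components can refuse to treat anything with `trusted=False` as an instruction, whether or not the screen flagged it.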

Second, the blast radius calculation changes. A compromised agent with broad tool access isn’t a single endpoint. It’s a privileged actor with the ability to chain actions across systems, modify logs, and impersonate other agents. Quarantine policies for log deletion requests aren’t paranoid; they’re explicitly recommended. So is isolating agents into enclaves with no write access to logs.
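One way to picture a quarantine policy for log deletion requests: route any destructive log action into a holding queue for human review instead of executing it. The `ActionRouter` class and the action names below are hypothetical.

```python
from collections import deque

# Hypothetical names for actions that destroy or alter log data.
LOG_DESTRUCTIVE = frozenset({"logs:delete", "logs:truncate", "logs:modify"})


class ActionRouter:
    """Executes ordinary actions and holds destructive log actions for review."""

    def __init__(self):
        self.quarantine = deque()  # requests awaiting human review
        self.executed = []         # requests that were allowed through

    def submit(self, agent_id: str, action: str, target: str) -> str:
        request = (agent_id, action, target)
        if action in LOG_DESTRUCTIVE:
            # Never executed automatically, regardless of the agent's privileges.
            self.quarantine.append(request)
            return "quarantined"
        self.executed.append(request)
        return "executed"
```

A quarantined request is also a useful detection signal in its own right, since few legitimate agent workflows need to delete logs.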

Third, incident response procedures benefit from agent-specific playbooks. Detection of agent compromise looks different from detection of user account compromise. Containment depends on the ability to revoke an agent identity instantly across every system it can authenticate to, rather than waiting for the next credential rotation cycle. Recovery depends on versioning and rollback to known-good agent behaviors, not just rebuilding a host.
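The containment step can be sketched as a fan-out revocation that reports where revocation could not be confirmed. `TargetSystem` is a hypothetical stand-in for anything an agent authenticates to; a real playbook would call each system's own revocation interface.

```python
class TargetSystem:
    """Hypothetical stand-in for a system that accepts agent identities."""

    def __init__(self, name: str, reachable: bool = True):
        self.name = name
        self.reachable = reachable
        self.active_agents = set()

    def revoke(self, agent_id: str) -> bool:
        """Return True only if revocation was applied."""
        if not self.reachable:
            return False
        self.active_agents.discard(agent_id)
        return True


def revoke_everywhere(agent_id: str, systems) -> list:
    """Revoke an agent on every system; return names where it did not succeed."""
    results = {system.name: system.revoke(agent_id) for system in systems}
    return sorted(name for name, ok in results.items() if not ok)
```

The returned list is the important output. Any system named in it still trusts the compromised identity and needs manual follow-up before containment can be declared complete.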

Implications for governance and compliance programs

The accountability risk class is where this lands hardest for compliance programs. The guidance is explicit that legal accountability and risk ownership for agentic AI systems must be defined in policy, that decisions about when human approval is required must be made by system designers and operators rather than delegated to the agent, and that comprehensive logs and unified audit trails must cover all inter-agent interactions.
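One possible shape for a unified audit trail is a single append-only log in which every entry carries a shared trace ID, records who or what authorized the action, and is hash-chained to the previous entry so tampering is detectable. The field names and functions below are hypothetical; the guidance does not prescribe a format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_event(log: list, trace_id: str, agent_id: str,
                 action: str, authorized_by: str) -> dict:
    """Append an entry linked to the previous entry's hash."""
    body = {
        "trace_id": trace_id,            # ties planning, retrieval, execution together
        "agent_id": agent_id,
        "action": action,
        "authorized_by": authorized_by,  # the human or agent that approved it
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and link; False means an entry was altered or removed."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Filtering the log by `trace_id` reconstructs one decision chain across agents. A hash chain only detects tampering; it does not prevent it, so the log itself still needs to live somewhere agents cannot write.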

For anyone running a PCI program: agentic AI in or adjacent to a cardholder data environment introduces scope questions that assessors are increasingly likely to raise in upcoming assessment cycles. An agent with access to systems that store, process, or transmit account data is in scope. An agent that can write to logs is in scope. An agent that can spawn sub-agents and delegate authority needs documented control flows that demonstrate who, or what, authorized each action. Programs that don’t yet have that documentation in place may want to prioritize closing that gap proactively.

For broader GRC programs: this is significant because agentic AI maps cleanly onto existing frameworks. The guidance explicitly says agentic AI does not require an entirely new security discipline. It can be folded into NIST AI RMF, the OWASP Top 10 for Agentic Applications, an existing zero trust roadmap, and current IAM governance. What does need to be in place is a governance body that can make policy decisions about agent autonomy levels, an updated risk taxonomy that includes the five risk classes, and incident response procedures rehearsed against agent compromise scenarios.

Recommended next steps

The guidance closes with a recommendation that organizations assume agentic AI systems may behave unexpectedly and plan deployments accordingly, prioritizing resilience, reversibility, and risk containment over efficiency gains. Translated into operational terms, that points to a few areas worth prioritizing.

The first is visibility. An inventory of every agent in the environment, including those that may have been spun up by individual teams without formal security review, gives the rest of the program something to stand on. That inventory should map each agent’s privileges, credential type, credential lifetime, and authorized tool list. Agents with write access to logs, those that can spawn sub-agents, and those that can take irreversible actions without human approval are good candidates for a priority remediation list.
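An inventory like this does not need special tooling to get started. The sketch below shows one possible record shape and a filter that produces the priority remediation list. The `AgentRecord` fields mirror the attributes named above, but the type and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    name: str
    privileges: set = field(default_factory=set)
    credential_type: str = "unknown"
    credential_lifetime_hours: float = 0.0
    authorized_tools: list = field(default_factory=list)
    can_write_logs: bool = False
    can_spawn_subagents: bool = False
    irreversible_without_approval: bool = False


def priority_reasons(record: AgentRecord) -> list:
    """Return the reasons, if any, that put this agent on the priority list."""
    reasons = []
    if record.can_write_logs:
        reasons.append("write access to logs")
    if record.can_spawn_subagents:
        reasons.append("can spawn sub-agents")
    if record.irreversible_without_approval:
        reasons.append("irreversible actions without human approval")
    return reasons


def remediation_list(inventory) -> list:
    """Pair each flagged agent with its reasons, preserving inventory order."""
    flagged = []
    for record in inventory:
        reasons = priority_reasons(record)
        if reasons:
            flagged.append((record.name, reasons))
    return flagged
```

Even a spreadsheet export loaded into records like these is enough to start ranking agents by risk while a fuller discovery process is built out.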

The second is threat model alignment. The five risk classes can be added directly to existing threat models, along with tool descriptions and retrieved content as untrusted input channels, agent-to-agent communication as a lateral movement vector, and log integrity as an availability and accountability concern rather than a compliance checkbox.

The third is governance engagement. A briefing for the governance body covering this guidance is a useful forcing function for explicit policy decisions on autonomy thresholds, human-in-the-loop requirements for high-impact actions, and accountability ownership when an agent causes harm. Bringing legal into the conversation early, with the framework already in hand, tends to produce better outcomes than waiting for them to ask.

With six allied cyber agencies behind this guidance, there is now broad cross-jurisdictional support for treating agent security as a current priority rather than a future one.

Juan Carlos Munera
Author
Passionate about cybersecurity, governance, risk, and compliance. Sharing insights on security best practices, frameworks, and industry trends.
