When the AI Forgets What It Knows: Memory Poisoning in Claude Code

Cisco’s AI Threat and Security Research team published findings that should reshape how engineering teams think about AI coding assistants. Researchers Idan Habler and Amy Chang demonstrated a method to compromise Claude Code’s persistent memory and maintain that compromise across every project, every session, and every reboot. Anthropic shipped a fix in Claude Code v2.1.50 after coordinated disclosure. The vulnerability is patched. The lesson is not.

If you’ve spent any time as a systems engineer wiring together services that trust each other, the failure mode here will feel familiar. It’s the same architectural mistake that’s haunted us for decades: a system treating attacker-influenced data as if it were authoritative configuration.

Patched in v2.1.50. If your team uses Claude Code, confirm the version every developer is running. Memory files written before the patch may still contain attacker-controlled instructions and should be reviewed manually.

What Cisco actually found

Claude Code, like most modern AI coding agents, maintains persistent memory across sessions. It does this through MEMORY.md files stored in the user’s home directory and within each project folder. The idea is simple and useful: the agent remembers your coding style, your project’s architecture, the conventions you’ve established, so you don’t have to re-explain them every time.

In the version Cisco evaluated, the first 200 lines of those memory files were loaded directly into Claude’s system prompt. For anyone unfamiliar with the term, the system prompt is the foundational instruction set that shapes how the model thinks and behaves. It’s the closest thing an LLM has to a kernel.

That’s the architectural flaw. Memory files were treated as high-authority additions to the system prompt, with the model assuming they were written by the user and following them implicitly. There was no boundary between trusted instructions and project-scoped inputs.
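To make the missing boundary concrete, here is a minimal Python sketch. It is not Anthropic’s implementation. Message, build_context_flawed, build_context_separated, and the project_memory wrapper tag are hypothetical names chosen for illustration; the only detail taken from the research is the 200-line cutoff.

```python
from dataclasses import dataclass

MEMORY_LINE_LIMIT = 200  # figure reported for the evaluated version


@dataclass
class Message:
    role: str      # "system" or "user" (hypothetical roles for this sketch)
    content: str


def head(text: str, limit: int = MEMORY_LINE_LIMIT) -> str:
    """Return at most `limit` lines of `text`."""
    return "\n".join(text.splitlines()[:limit])


def build_context_flawed(base_prompt: str, memory: str) -> list:
    """Splice memory into the system prompt, so it inherits system authority."""
    return [Message("system", base_prompt + "\n\n" + head(memory))]


def build_context_separated(base_prompt: str, memory: str) -> list:
    """Keep memory outside the system prompt, as labeled lower-trust context."""
    wrapped = (
        '<project_memory trust="untrusted">\n'
        + head(memory)
        + "\n</project_memory>"
    )
    return [Message("system", base_prompt), Message("user", wrapped)]
```

In the first function, anything an attacker writes into the memory file becomes indistinguishable from the vendor’s own instructions. In the second, the model at least has a structural signal that the text is data to weigh rather than policy to obey. Labeling alone does not defeat prompt injection; it only restores the boundary the flawed version erased.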

The attack chain

Cisco’s exploit reads like a textbook supply chain compromise wearing modern clothes.

Step 1: Entry through npm. The researchers used a known and well-understood vector: npm lifecycle hooks. The postinstall hook allows arbitrary code execution during package installation. This is legitimate behavior used by countless packages for setup tasks. It’s also the same vector that has driven supply chain attacks against the JavaScript ecosystem for years.

In the proof of concept, the user clones a repository and asks Claude Code to set it up. Claude offers to install the npm packages. The user approves. The malicious payload runs as part of that approved install.
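Before approving an install, it helps to see which packages declare install-time scripts at all. The following is a minimal sketch, assuming you run it against a cloned repository or an already populated node_modules tree. find_lifecycle_scripts is a hypothetical helper, and the script names listed are the ones npm can run automatically during an install.

```python
import json
from pathlib import Path

# Scripts that npm can run automatically during `npm install`.
LIFECYCLE_SCRIPTS = ("preinstall", "install", "postinstall", "prepare")


def find_lifecycle_scripts(root: Path) -> list:
    """Return (manifest path, script name, command) for each lifecycle script under root."""
    findings = []
    for manifest in sorted(root.rglob("package.json")):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest: skip rather than crash
        scripts = data.get("scripts") if isinstance(data, dict) else None
        if not isinstance(scripts, dict):
            continue
        for name in LIFECYCLE_SCRIPTS:
            if name in scripts:
                findings.append((str(manifest), name, str(scripts[name])))
    return findings
```

Calling find_lifecycle_scripts(Path(".")) from a repository root lists every manifest that would execute code on install. A hit is not proof of malice, since many legitimate packages use these hooks. It is a list of commands worth reading before anyone clicks approve.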

Step 2: Poisoning the memory. The payload overwrites the project memory files at ~/.claude/projects/*/memory/MEMORY.md and the global hooks configuration at ~/.claude/settings.json. Critically, it targets the UserPromptSubmit hook, which executes before every prompt and injects its output directly into Claude’s context. Because that hook lives in the global settings file rather than in any one repository, the injection recurs across all projects, sessions, and reboots.
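A quick way to see what is registered in that settings file is to list every command it can run. The sketch below assumes only that hook definitions sit under a top-level "hooks" key and carry their shell command in a "command" field; that layout is my assumption, so check it against the current Claude Code hooks documentation. collect_commands and list_hook_commands are hypothetical helper names.

```python
import json
from pathlib import Path


def collect_commands(node, trail=()):
    """Recursively yield (location, command) for every string-valued "command" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "command" and isinstance(value, str):
                yield "/".join(trail), value
            else:
                yield from collect_commands(value, trail + (str(key),))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from collect_commands(item, trail + (str(index),))


def list_hook_commands(settings_path: Path) -> list:
    """Return every command found under the top-level "hooks" key, or [] if none."""
    try:
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return []  # missing or malformed settings file
    if not isinstance(settings, dict):
        return []
    return list(collect_commands(settings.get("hooks", {}), ("hooks",)))
```

Running list_hook_commands(Path.home() / ".claude" / "settings.json") returns each hook’s location in the file alongside the command it executes. Any entry the developer does not recognize, especially under UserPromptSubmit, deserves a close look.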

Step 3: Persistence even if you turn the feature off. This is the part that should make you uncomfortable. The payload appends a shell alias to .zshrc or .bashrc:

alias claude='CLAUDE_CODE_DISABLE_AUTO_MEMORY=0 claude'

Every time the user launches Claude through that alias, CLAUDE_CODE_DISABLE_AUTO_MEMORY is forced to 0 and auto-memory is silently re-enabled. Disabling the feature in the UI does nothing. The user thinks they’ve turned it off. The shell alias quietly turns it back on.

More consequential than a typical CVE

If Cisco had stopped at “we made the agent prefix every response with a string,” this would be a curiosity. They didn’t stop there. They poisoned the memory to provide systematically insecure guidance. When the test user asked where to store a vendor API key, the poisoned agent recommended committing the key directly to a source file, advised against using .env files or environment variables, offered to scaffold the insecure file structure automatically, and provided no security warnings whatsoever.

A junior developer following that advice would never know it was wrong. The agent’s output looked authoritative because, architecturally, it was. The model had been told these were the project’s mandatory practices, and it complied.

The practitioner takeaway. This isn’t really a vulnerability unique to Claude Code. It’s a vulnerability in how AI coding agents architecturally trust their own memory. Other agents that load persistent context into the system prompt without trust boundaries will exhibit the same class of flaw until that pattern changes industry-wide.

The non-human identity angle

For the GRC and identity practitioners reading this, the angle gets interesting. The persistent memory of an AI coding agent is, functionally, a long-lived credential bound to a non-human identity. It carries authority that influences code, dependency selection, and security posture. It executes on behalf of a developer with that developer’s machine permissions.

We’ve spent the last few years arguing that NHI governance has to extend beyond service accounts and API keys. This is the argument made tangible. A poisoned memory file is a credential compromise wearing a different mask. The developer’s identity hasn’t been stolen. Their agent’s identity has been quietly redefined.

If your organization has an NHI inventory program, AI agent context files belong on that inventory. If you’re scanning for hardcoded secrets, consider whether you’re also scanning for unexpected modifications to agent memory. Most teams aren’t, because the tooling barely exists yet.

The supply chain dimension

I’ve written before about the npm ecosystem as a persistent supply chain risk. This research demonstrates that AI coding agents are now a force multiplier for that risk. A single malicious package no longer just compromises the developer’s machine for the duration of one project. It compromises the cognitive layer the developer relies on for guidance across every future project.

That’s a meaningful escalation. Traditional malware in an npm package needs to do its work and exfiltrate quickly before something flags it. A poisoned memory file can sit there for months, quietly steering the developer toward insecure patterns, with no obvious indicator of compromise.

What Anthropic clarified about responsibility

In the disclosure, Anthropic clarified two security boundary positions that engineering and GRC leaders should read carefully.

First, the user principal on the machine is considered fully trusted. Scripts running as the user are intentionally allowed to modify settings and memories.

Second, the attack requires the user to interact with an untrusted repository. Users are ultimately responsible for vetting any dependencies they introduce.

Both positions are defensible from a vendor perspective. Both also place a significant operational and governance burden on the deploying organization. If your developers are running Claude Code and pulling open source dependencies, your threat model now includes persistent memory poisoning, and your security program needs to account for it.

What to do about it

Some of this is hygiene you should already have in place. Some of it is new.

Confirm Claude Code v2.1.50 or later across every developer workstation. Audit existing MEMORY.md files in user home directories and project folders for content that doesn’t match what the developer remembers writing. Look specifically for instructions that frame insecure practices as architectural requirements.
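A first-pass triage of memory files can be scripted. The sketch below flags lines where imperative wording sits next to secret-handling terms. The two word lists are hypothetical heuristics of mine, not indicators published by Cisco, and flag_lines and audit_memory_files are illustrative names. It narrows down what a human should read; it does not replace the manual review.

```python
import re
from pathlib import Path

# Hypothetical heuristics: imperative wording near secret-handling terms.
MANDATE = re.compile(r"\b(must|always|never|mandatory|required|do not)\b", re.I)
SENSITIVE = re.compile(
    r"(api[ _-]?key|secret|token|password|credential|\.env\b|environment variable)",
    re.I,
)


def flag_lines(text: str) -> list:
    """Return (line number, line) pairs where a mandate and a sensitive term co-occur."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if MANDATE.search(line) and SENSITIVE.search(line)
    ]


def audit_memory_files(claude_dir: Path) -> dict:
    """Scan every MEMORY.md under claude_dir and return flagged lines per file."""
    report = {}
    for path in sorted(claude_dir.rglob("MEMORY.md")):
        try:
            hits = flag_lines(path.read_text(encoding="utf-8", errors="replace"))
        except OSError:
            continue  # unreadable file: leave it for manual inspection
        if hits:
            report[str(path)] = hits
    return report
```

Point audit_memory_files at Path.home() / ".claude" and at each project folder that holds its own memory file. Expect false positives, because a legitimate note such as “never commit secrets” will also match. That is acceptable for a review queue, and an empty report is not evidence that a file is clean.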

Review shell configuration files (.zshrc, .bashrc, .profile, and equivalents) for unexpected aliases involving claude or other AI agent CLIs. The pattern Cisco used is one example. Other patterns are possible.
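The shell-configuration review can be partly automated too. This is a minimal sketch, assuming the common rc file names listed in RC_FILES and three hypothetical patterns: an alias named claude, a shell function named claude, and any assignment to a CLAUDE_CODE_ variable. scan_rc_text and scan_home are illustrative names.

```python
import re
from pathlib import Path

RC_FILES = (".zshrc", ".bashrc", ".profile", ".bash_profile", ".zprofile")

PATTERNS = (
    re.compile(r"^\s*alias\s+claude="),                       # alias shadowing the CLI
    re.compile(r"^\s*(function\s+claude\b|claude\s*\(\))"),   # function shadowing the CLI
    re.compile(r"\bCLAUDE_CODE_[A-Z0-9_]+="),                 # environment override
)


def scan_rc_text(text: str) -> list:
    """Return (line number, line) for each non-comment line matching a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip full-line comments
        if any(pattern.search(line) for pattern in PATTERNS):
            hits.append((number, stripped))
    return hits


def scan_home(home: Path) -> dict:
    """Scan the known rc files under `home` and return hits per file."""
    report = {}
    for name in RC_FILES:
        path = home / name
        if path.is_file():
            hits = scan_rc_text(path.read_text(encoding="utf-8", errors="replace"))
            if hits:
                report[str(path)] = hits
    return report
```

scan_home(Path.home()) covers one workstation. A developer may have set a CLAUDE_CODE_ variable on purpose, so treat each hit as a question to ask rather than a verdict. The script also will not catch a payload that hides in a sourced file or uses a different wrapper, which is why the manual review still matters.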

Treat AI agent context files the same way you treat configuration management. They influence behavior. They deserve change control, integrity monitoring, and review.
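Integrity monitoring for these files does not need special tooling to get started. The sketch below hashes a set of watched files and reports what changed between two snapshots. The watch list in watched_files is my assumption, built from the paths named in the research, and all function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def watched_files(home: Path) -> list:
    """Assumed watch list: global settings, common rc files, and all MEMORY.md files."""
    claude = home / ".claude"
    files = [claude / "settings.json", home / ".zshrc", home / ".bashrc"]
    if claude.is_dir():
        files += sorted(claude.rglob("MEMORY.md"))
    return files


def snapshot(paths) -> dict:
    """Map each existing file to its current digest."""
    return {str(p): sha256_of(p) for p in paths if p.is_file()}


def diff_snapshots(baseline: dict, current: dict) -> dict:
    """Classify files as added, removed, or modified relative to the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(k for k in shared if baseline[k] != current[k]),
    }
```

Take a snapshot when a workstation is in a known-good state, store it somewhere the developer’s own user account cannot rewrite, and compare on a schedule. Memory files change legitimately as the agent learns, so a modified entry is a prompt for review, not an alert in itself. Changes to settings.json or the rc files should be much rarer and easier to explain.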

Reconsider how your developers consume open source dependencies. The --ignore-scripts flag for npm exists for a reason. Lockfile review, dependency pinning, and SBOM tracking matter more, not less, when AI agents are part of the development loop.

If you operate in a regulated environment, document this risk in your AI usage policies. The fact that an attacker could persistently influence the security guidance your developers receive is a finding most auditors will eventually start asking about.

For PCI environments specifically. PCI DSS v4.0.1 Requirement 6.2.4 covers the software engineering techniques used to prevent or mitigate common software attacks in bespoke and custom software, and Requirement 6.3 covers identifying and addressing security vulnerabilities, including those in third-party components. If developers writing code that touches the cardholder data environment are using AI coding assistants with persistent memory, that toolchain is in scope for your secure software development controls. Document the controls. Verify the version. Don’t assume the agent is giving secure advice by default.

The bigger picture

The patch in v2.1.50 removes user memories from the system prompt. That closes the specific vector Cisco found. It does not solve the underlying problem, which is that we’re deploying autonomous tools with broad system access and persistent state, and our existing security frameworks weren’t designed for them.

Prompt injection remains an unsolved problem in production AI systems. Memory and context data will continue to be a target precisely because they offer attackers exactly what they want: persistence, influence, and a trust relationship with the user that the user themselves established.

For engineering teams, this is a reminder that AI coding assistants are not neutral utilities. They’re trust relationships, and trust relationships need governance. For GRC teams, it’s a reminder that the NHI conversation now extends into the cognitive layer of the development environment.

The fix is shipped. The architectural lesson is the part worth keeping.

Sources

Identifying and remediating a persistent memory compromise in Claude Code, Cisco Blogs, April 1, 2026
Anthropic Claude Code v2.1.50 release

Author

Juan Carlos Munera

Passionate about cybersecurity, governance, risk, and compliance. Sharing insights on security best practices, frameworks, and industry trends.
