Your Agent Runtime Is the New Persistence Surface

Thiago Victorino

On May 11 and 12, 2026, the Mini Shai-Hulud campaign reached a scale that turned a familiar story on its head. Socket counted 416 compromised package artifacts across npm and PyPI. Aikido logged 373 malicious package-version entries spread over 169 npm names. Endor Labs put the npm number above 160. The list of affected namespaces reads like an inventory of the modern stack: TanStack, Mistral AI, Guardrails AI, OpenSearch, UiPath, Bitwarden CLI, and SAP’s official packages. @tanstack/react-router alone draws 12 million weekly downloads.

The familiar story is “a malicious package was published, audit your dependencies, pin your versions.” That story is no longer enough. The novel mechanic, reported by BleepingComputer and confirmed by Snyk and StepSecurity, is what the malware does after npm install finishes. It writes itself into Claude Code hooks and VS Code auto-run tasks. Uninstalling the bad version does not remove the infection. The package is gone. The hook stays. The agent runtime executes the payload on every subsequent session.

That is a category shift, and it deserves to be named. The agent runtime, the dotfiles directory you treat as developer configuration, is now a persistence-grade surface.

What Actually Happened

Snyk and StepSecurity describe a coordinated, self-propagating campaign attributed to a group calling itself TeamPCP, building on the original Shai-Hulud worm that hit npm in late 2025. The mechanic is consistent across detections from Socket, Endor Labs, Aikido, StepSecurity, Snyk, Microsoft Threat Intelligence, SafeDep, and Wiz. The malware harvests credentials from the developer machine on first run: SSH keys, AWS credentials, GitHub CLI tokens, npm tokens, RubyGems credentials, .netrc files, Kubernetes service accounts. It exfiltrates them, then uses the stolen publishing tokens to push poisoned versions of packages owned by the victim. That is the self-spreading loop. One compromised maintainer machine becomes the publishing source for the next wave.

The persistence step is the new piece. BleepingComputer reports that the payload installs itself into Claude Code hooks and VS Code auto-run tasks on the developer machine. Hooks fire on documented agent events. Auto-run tasks fire on workspace open. Both surfaces execute shell commands. Both surfaces live in project-level or user-level directories that endpoint detection tools rarely watch for code execution. The malware persists across npm uninstall. It persists across deleting node_modules. It persists across reinstalling clean versions of the original package. The package was the delivery vehicle. The agent runtime is the lodging.
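To make the auto-run surface concrete, here is a minimal sketch that flags VS Code tasks configured to run when a folder opens. It relies on the documented runOptions.runOn setting with the value folderOpen. The function name find_autorun_tasks is hypothetical, and the sketch assumes the file is strict JSON, which real tasks.json files are not always.

```python
import json
from pathlib import Path


def find_autorun_tasks(tasks_file: Path) -> list[dict]:
    """Return the tasks in a tasks.json that run automatically on folder open."""
    # Assumption: strict JSON. VS Code also accepts comments (JSONC), which
    # json.loads rejects. Treat a parse failure as "review by hand", never
    # as "clean".
    data = json.loads(tasks_file.read_text(encoding="utf-8"))
    flagged = []
    for task in data.get("tasks", []):
        run_on = task.get("runOptions", {}).get("runOn")
        if run_on == "folderOpen":
            flagged.append({
                "label": task.get("label", "<unlabeled>"),
                "command": task.get("command", ""),
            })
    return flagged
```

Pointing this at every .vscode/tasks.json under a source tree gives a short list of tasks that execute without anyone clicking anything. An empty list says nothing about hooks or other surfaces.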

A useful parallel exists in the Ruby world. The RubyGems security team, drawing on tooling that originates from Maciej Mensfeld’s supply-chain work, runs static and dynamic analysis on every gem published to the registry. Their blog reports that this pipeline catches an estimated 70 to 80 percent of malicious gems before public disclosure. Note where that pipeline lives. It lives at the registry. Not on the developer machine. Not in the agent’s hook directory. The current attack route bypasses the registry by waiting for the install to land, then writing into a directory the registry never sees.

Why This Is a Category Shift

Software supply-chain defense, as practiced today, rests on three assumptions. First, the registry is the chokepoint, so we scan packages there. Second, the package is the unit of compromise, so we pin versions and rotate when a bad one ships. Third, the install step is the boundary, so removing the bad version returns the system to a known state. Mini Shai-Hulud breaks the third assumption in public.

Once a malicious package can write to an agent hook configuration (Claude Code reads hooks from settings files such as .claude/settings.json) or to .vscode/tasks.json, removing the package is no longer the cleanup procedure. It is the start of forensics. You now have to enumerate every agent-controlled execution surface on the machine, diff it against a clean baseline, and prove that no hook, no task, no startup script, no MCP server definition was added or modified. Most teams cannot answer that question for a single laptop, let alone a fleet.
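As a starting point for that enumeration, the sketch below lists every shell command registered as a hook in one configuration file. It assumes a nested JSON layout of event name, then matcher groups, then hook entries with a command field, which is how Claude Code settings files are commonly structured. Check the layout against your agent's current documentation before relying on it. The function name list_hook_commands is hypothetical.

```python
import json
from pathlib import Path


def list_hook_commands(config_file: Path) -> list[tuple[str, str]]:
    """Return (event, command) pairs for every command hook in the file."""
    # Assumed layout (verify against your agent's docs):
    # {"hooks": {"<Event>": [{"matcher": "...",
    #                         "hooks": [{"type": "command", "command": "..."}]}]}}
    data = json.loads(config_file.read_text(encoding="utf-8"))
    found = []
    for event, groups in data.get("hooks", {}).items():
        for group in groups:
            for hook in group.get("hooks", []):
                if "command" in hook:
                    found.append((event, hook["command"]))
    return found
```

Every pair this returns is a command the agent runs on its own schedule. Each one should have an owner who can explain it.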

This is the third act in a story we have been tracking. The first act was prompt injection as a supply chain weapon, where attacker-controlled text steered an agent into leaking credentials. The second act was Clinejection, where an AI triage bot itself became the vehicle that pushed a poisoned package to thousands of installs. This third act is persistence. The package is the messenger. The agent runtime is the bunker.

If you saw the two-front nature of the supply-chain crisis as primarily about velocity, this update is about depth. Attackers are no longer racing through the front door. They are renting the basement.

Your Config Directory Is Now a Control Plane

Treat the framing operationally. A control plane is any surface that decides what code runs, when it runs, and with what privileges. By that definition, .claude/, .vscode/, .cursorrules, .mcp.json, .aider.conf.yml, and every other agent configuration file or directory has been a control plane for at least a year. We just did not govern them like one.

Compare the hygiene around three surfaces.

The cron table on a server. Anyone proposing to add a cron job goes through change management. The job is reviewed. It is checked into infrastructure as code. It is monitored. Modifications trigger alerts.

A systemd unit file. Same answer. Reviewed, version-controlled, alerted on, owned by platform engineering.

A hook configuration file in a developer’s home directory. Nobody reviews it. It is not in any inventory. No tool alerts when its hash changes. It can spawn shells, send network traffic, modify files, and read secrets, and the standard endpoint stack does not blink. This is not a hypothetical. This is the durable lodging that Mini Shai-Hulud reached for, because attackers always reach for what is least watched.

The agent runtime sits inside the developer trust boundary by default. That boundary made sense when the only thing inside it was the developer’s own editor configuration. It does not make sense now that the boundary contains a runtime that executes arbitrary instructions from text that arrives over the network.

What To Do This Week

The remediation is unglamorous and concrete. None of it requires new vendors.

Audit every agent hook directory in your environment. Treat .claude/, .vscode/, .cursor/, .mcp.json, .aider*, ~/.config/claude*, and equivalent paths as privileged. Snapshot a clean baseline. Diff against it. Alert on any unauthorized modification, the same way you would alert on a new cron entry on a production host.
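The snapshot-and-diff step can be sketched in a few lines. This is a minimal illustration, not a monitoring product: the WATCHED list copies a few paths from the paragraph above and is meant to be extended, and the function names snapshot and diff_snapshots are hypothetical.

```python
import hashlib
from pathlib import Path

# A few of the paths named above; extend for your environment.
WATCHED = [".claude", ".vscode", ".cursor", ".mcp.json"]


def snapshot(root: Path, watched: list[str] = WATCHED) -> dict[str, str]:
    """Map each watched file (path relative to root) to its SHA-256 digest."""
    digests = {}
    for name in watched:
        target = root / name
        if target.is_file():
            files = [target]
        elif target.is_dir():
            files = sorted(p for p in target.rglob("*") if p.is_file())
        else:
            files = []  # path not present on this machine
        for f in files:
            rel = f.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(f.read_bytes()).hexdigest()
    return digests


def diff_snapshots(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or modified since the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(k for k in shared if baseline[k] != current[k]),
    }
```

Run snapshot once against a home directory or repository you trust, store the result somewhere the developer account cannot write, and alert whenever diff_snapshots returns a non-empty list. The baseline is only as trustworthy as the moment it was taken.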

Move those directories into version control where they belong, and require review on changes. If a developer cannot articulate why a hook exists, the hook should not exist.

Treat package installs as untrusted execution events, not configuration events. Use lockfile-only installs (npm ci, pip install --require-hashes -r requirements.txt, bundle install --frozen or its config equivalent, bundle config set frozen true) on every machine that touches production credentials. Forbid post-install scripts on CI runners that hold publishing tokens (for npm, the --ignore-scripts flag). Where you cannot forbid them, sandbox them.
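One way to see what an install would execute is to install with scripts disabled (for npm, npm ci --ignore-scripts) and then list the packages that declare install-time lifecycle scripts. The sketch below does the listing. The function name packages_with_install_scripts is hypothetical, and the walk is deliberately naive: it reads every package.json under node_modules, including any that ship as test fixtures inside packages.

```python
import json
from pathlib import Path

# npm lifecycle scripts that run during installation.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def packages_with_install_scripts(node_modules: Path) -> dict[str, dict[str, str]]:
    """Map 'name@version' to the install-time scripts that package declares."""
    results = {}
    for manifest in node_modules.rglob("package.json"):
        try:
            pkg = json.loads(manifest.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # unreadable manifest: skip here, review by hand
        if not isinstance(pkg, dict) or "name" not in pkg:
            continue
        scripts = pkg.get("scripts")
        if not isinstance(scripts, dict):
            continue
        hits = {k: scripts[k] for k in INSTALL_HOOKS if k in scripts}
        if hits:
            results[f'{pkg["name"]}@{pkg.get("version", "?")}'] = hits
    return results
```

The output is a review list, not a verdict. Many legitimate packages use postinstall to build native modules, so the useful question is which entries are new since the last review.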

If any machine in your organization installed any version of the affected packages, treat that machine as breached until proven otherwise. The compromised namespaces include TanStack, Mistral AI, Guardrails AI, OpenSearch, UiPath, Bitwarden CLI, and SAP official packages. Rotate every credential that the machine could reach: SSH keys, AWS credentials, GitHub and GitHub CLI tokens, npm tokens, RubyGems credentials, .netrc entries, Kubernetes service accounts. Rotate them and verify the old credentials are dead. The Clinejection incident showed what “rotated but still working” costs.
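A first pass at scoping exposure is to check lockfiles for the affected package names. The sketch below reads the packages map of a package-lock.json (lockfile version 2 or 3) and reports the resolved version of each watched name, so responders can compare against the vendor advisories. The function name find_installed is hypothetical. The caller supplies the watch list from those advisories; the sketch does not encode which versions were malicious.

```python
import json
from pathlib import Path


def find_installed(lockfile: Path, names: set[str]) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs for watched packages in the lockfile."""
    lock = json.loads(lockfile.read_text(encoding="utf-8"))
    hits = set()
    for key, meta in lock.get("packages", {}).items():
        if "node_modules/" not in key:
            continue  # root project or workspace entry
        # Nested installs look like node_modules/a/node_modules/b;
        # the package name is whatever follows the last node_modules/.
        name = key.rsplit("node_modules/", 1)[1]
        if name in names:
            hits.add((name, meta.get("version", "")))
    return sorted(hits)
```

For example, find_installed(Path("package-lock.json"), {"@tanstack/react-router"}) lists the resolved versions of that package in one project. A lockfile records what a project resolves today, not what a laptop installed last week, so a clean result narrows the search without clearing the machine.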

Separate the agent’s read scope from the agent’s execution scope. An agent that needs to read source files does not need to write hooks. An agent that needs to write hooks does not need network access. Most teams have not drawn that line yet.

Push your registries to do what RubyGems does. Static and dynamic analysis at publication time will not catch everything, but the RubyGems blog reports a 70 to 80 percent catch rate on malicious gems before public disclosure. That is the kind of upstream defense that compounds. The current attack would still install. The current attack would not have shipped this widely.

The agent runtime is now part of your attack surface inventory. If it is not on the inventory, the inventory is wrong.


This analysis synthesizes “Shai Hulud attack ships signed malicious TanStack, Mistral npm packages” (BleepingComputer, May 2026), “TanStack npm Packages Hit by Mini Shai-Hulud” (Snyk, May 2026), and “Mini Shai-Hulud Is Back” (StepSecurity, May 2026).

Victorino Group helps governance, security, and engineering leaders rebuild the containment story around their AI tooling. Let’s talk.

All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.
