The Week Both Sides of the Supply Chain Got Industrial

Between May 21 and May 22, 2026, four announcements landed within 48 hours of one another. GitHub disclosed that 3,800 of its own internal repositories had been exfiltrated through a malicious VS Code extension. Anthropic published the first numbers from Project Glasswing, its restricted security model program: more than 10,000 vulnerabilities surfaced in critical software in a single month. The Anthropic Red Team released exploit evaluations for Mythos Preview, which solved 21 of 41 ExploitBench CVEs while every other model managed two or fewer. Perplexity open-sourced Bumblebee, a read-only scanner that treats agent endpoints (extensions, MCP configs, lockfiles) as scannable surfaces.

None of these were coordinated. They still describe one event.

The AI-era supply chain crisis has crossed the industrial threshold on both sides. The offense side has flywheel mechanics, named victims, and a price list. The defense side has eval scores, a partner pipeline, and a first open-source artifact. The immediate question, the one engineering and security leaders need to answer this week, is not whether agent endpoints need controls. It is whether an inventory of those endpoints exists at all.

The Offense Side: TeamPCP Reached the Top of the Stack

We have written about TeamPCP through individual incidents before. Clinejection showed a single npm package compromising Cline installs. The Mercor wave showed the same operator hitting AI training data infrastructure. Prompt injection as a supply-chain weapon traced the technique into the model loop itself.

What May 21 added is the layer TeamPCP had not yet touched: the platform that hosts the supply chain.

GitHub CISO Alexis Wales confirmed 3,800 internal repositories were exfiltrated through a single VS Code extension. The asking price on BreachForums was $50,000. Aikido Security tracked the takedown windows: 18 minutes on the VS Code Marketplace, 36 minutes on Open VSX. Fast response, in absolute terms. Still a combined 54 minutes, across the two marketplaces, during which a poisoned extension was available through the default download channel for a critical developer surface.

The broader campaign numbers, published by Wiz, Socket, and Palo Alto Networks the next day, frame the scale:

20 distinct supply-chain waves over the year
500+ poisoned packages, more than 1,000 counting versions
Confirmed downstream victims include OpenAI (two employee devices), Mistral AI, Mercor, the European Commission public site, TanStack, LiteLLM, Trivy, and AntV

The economic logic is straightforward. A poisoned extension that runs on a GitHub engineer’s laptop returns more value than one that runs on a junior developer’s hobby project. TeamPCP is now operating at the layer where developer tooling itself is the target. Every layer below (npm registries, language ecosystems, framework maintainers) already absorbed waves earlier in the year. The platform layer was the remaining ceiling.

That ceiling has been pierced.

The Defense Side: Glasswing Showed Offense AI Scales With Defense AI

Project Glasswing is Anthropic’s restricted-distribution security model: more capable than the public Claude line, accessible only to vetted security partners under specific use restrictions. The governance model has been documented since April. The May 22 initial update is the first time the program reported what it found.

The numbers carry weight because they are field-tested, not benchmark-tested:

10,000+ vulnerabilities across systemically important software in one month
Roughly 50 active partners
6,202 high- or critical-severity vulnerabilities discovered across 1,000+ open-source projects
Cloudflare alone surfaced 2,000 bugs and reported a false-positive rate “better than human testers”
Firefox 150 accounted for 271 vulnerabilities, a 10x increase over Firefox 148, attributable to running Opus 4.6 against the same codebase

The strategic claim Glasswing validates is older than the data: offensive AI capability and defensive AI capability scale on the same curve. If a model can construct an exploit chain, the same model can find the conditions that enable that chain. The question has never been which capability arrives first. They arrive together. Governance determines which one reaches the field at scale.

Glasswing is the first program where the defense-side reach was measured against the offense-side reach in the same month. Defense reached further. Restricted distribution made that possible.

Mythos Preview: The Exploit Eval Becomes a Commodity Benchmark

The Anthropic Red Team’s exploit evaluation paper is the third leg of the May 22 stool. It is also the most uncomfortable.

Mythos Preview solved 21 of 41 ExploitBench CVEs by writing arbitrary code execution exploits. Every other tested model solved two or fewer. Mythos was the only model that escaped a V8 sandbox. The performance doubling time, measured against the prior generation, was 0.7 months. The prior doubling was 1.1 months. On SCONE-bench, the smart-contract exploit eval, the dollar value of successfully exploited contracts crossed $35 million.

The numbers matter less than the trajectory. Multi-step exploit construction, which 12 months ago required a senior offensive security researcher, is now a model capability. Restricted distribution slows the commodity arrival but does not stop it. The exploit eval is now a benchmark that frontier labs publish against each other. Open-weights catch-up is a question of months.

Bumblebee: The First Defender-Side Agent-Endpoint Scanner

Open-source AI offense has been the asymmetry we have been tracking. Defense had no equivalent artifact pointed at the surfaces agents actually touch.

Perplexity’s Bumblebee, open-sourced on May 22, is the first one to ship.

The design choices reveal what defenders had been missing:

Bumblebee scans four endpoint surfaces: language package managers (npm, pip, cargo, gem, others), MCP configuration files, VS Code-family extensions (VS Code, Cursor, Windsurf), and browser extensions.
It is read-only by design. It does not invoke npm install, does not trigger postinstall hooks, and does not run the code it inventories. The reason is explicit in the project README: an active scan risks triggering the exact payload Bumblebee exists to find.
Perplexity Computer, the agent that drafts the catalog, opens pull requests for human review. The agent does not auto-commit the inventory.
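
To make the read-only principle concrete, here is a minimal sketch of our own, not Bumblebee's code. It assumes an npm lockfile in the lockfileVersion 2 or 3 layout, where dependencies sit in a flat "packages" map keyed by install path. The function name list_locked_packages is hypothetical. It only parses JSON and never shells out to npm, so no install or postinstall script can run.

```python
import json
from pathlib import Path


def list_locked_packages(lockfile: Path) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs from an npm package-lock.json.

    Read-only: the file is parsed as JSON and nothing is executed.
    Assumes the lockfileVersion 2/3 layout with a flat "packages" map.
    """
    data = json.loads(lockfile.read_text(encoding="utf-8"))
    found = []
    for install_path, meta in data.get("packages", {}).items():
        if not install_path:
            continue  # the empty key describes the root project itself
        # "node_modules/a/node_modules/b" -> "b"; scoped names keep "@scope/".
        name = install_path.split("node_modules/")[-1]
        found.append((name, meta.get("version", "unknown")))
    return sorted(found)
```

Pointing this at a checked-out repository's package-lock.json yields the dependency list without touching the network or the package manager. The same pattern, parse and never execute, applies to other lockfile formats.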

The artifact’s existence shifts the conversation. MCP configuration files now have a scannable inventory format. VS Code extension installations now have a defender-oriented enumerator. The argument that “we cannot inventory what we do not have a tool for” no longer applies. The tool exists, is free, and is open source.
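
For a sense of how little a defender-oriented extension enumerator needs, here is a hedged sketch. It is not Bumblebee's implementation. It assumes extensions live as subdirectories that each contain a package.json manifest with publisher, name, and version fields, and that the per-user directories listed in DEFAULT_EXTENSION_DIRS are where they are installed. Those paths are assumptions to verify on your own machines, and list_extensions is a hypothetical name.

```python
import json
from pathlib import Path

# Assumed per-user install locations; verify these for your own fleet.
DEFAULT_EXTENSION_DIRS = [
    Path.home() / ".vscode" / "extensions",
    Path.home() / ".cursor" / "extensions",
]


def list_extensions(ext_dir: Path) -> list[dict]:
    """Read each extension's package.json manifest; never load extension code."""
    rows = []
    if not ext_dir.is_dir():
        return rows
    for manifest in sorted(ext_dir.glob("*/package.json")):
        try:
            meta = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest: skip rather than guess
        rows.append({
            "id": f"{meta.get('publisher', '?')}.{meta.get('name', '?')}",
            "version": meta.get("version", "?"),
            "path": str(manifest.parent),
        })
    return rows


if __name__ == "__main__":
    for directory in DEFAULT_EXTENSION_DIRS:
        for row in list_extensions(directory):
            print(row["id"], row["version"], row["path"])
```

The output is a plain list of identifier, version, and path, which is enough to start asking who installed what.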

The Governance Levers Are Named

Three controls are now concrete enough for a Q3 governance plan:

Long-lived credentials in developer tooling. The GitHub breach worked because a VS Code extension running on an engineer’s laptop carried the access to read internal repositories. The compute boundary, the data boundary, and the identity boundary collapsed into one process. The fix is not a new policy. The fix is workload-identity federation reaching developer extensions, which is where it has been absent.

Extension review as a first-class control. VS Code Marketplace and Open VSX both took the malicious extension down inside an hour. That is a reactive control. The proactive control is treating extension installs the way enterprise security treats software installations on a production server: an approval queue, a signed manifest, a per-version sign-off. Most organizations do not run this for developer tooling because no one previously asked.
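
A per-version sign-off can start as a comparison between what is installed and what has been approved. The sketch below is illustrative only: the function review_installs, the extension identifiers, and the shape of the approval record are all hypothetical, and a real queue would also verify the signed manifest.

```python
def review_installs(installed, approved):
    """Flag installed extensions that lack a per-version sign-off.

    installed: iterable of (extension_id, version) pairs
    approved:  dict mapping extension_id -> set of signed-off versions
    """
    findings = []
    for ext_id, version in installed:
        if ext_id not in approved:
            findings.append((ext_id, version, "extension never approved"))
        elif version not in approved[ext_id]:
            findings.append((ext_id, version, "version not signed off"))
    return findings


# Hypothetical approval record and install list, for illustration only.
approved = {"example-publisher.linter": {"1.4.0", "1.4.1"}}
installed = [
    ("example-publisher.linter", "1.4.1"),  # approved version
    ("example-publisher.linter", "1.5.0"),  # updated past the sign-off
    ("unknown-publisher.theme", "0.0.3"),   # never went through review
]

for finding in review_installs(installed, approved):
    print(finding)
```

Keying approval on the version, not just the extension, matters because a trusted extension can ship a poisoned update.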

MCP configuration inventory. Bumblebee is the artifact that makes this enumerable. The question to ask in the next platform team meeting: “Which agents on which machines load which MCP servers, and where are the configs stored?” If the answer is “we do not know,” the work starts there.
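
A first pass at answering that question can be a read-only parse of each config file. The sketch below is ours, not Bumblebee's. It assumes the JSON layout several MCP clients use, with servers listed under an "mcpServers" key (or "servers" in some clients), each declaring either a command with args or a url. The function name list_mcp_servers is hypothetical, and config locations vary by client, so the path is passed in by the caller.

```python
import json
from pathlib import Path


def list_mcp_servers(config_path: Path) -> list[dict]:
    """List the MCP servers one config file declares, without starting any."""
    data = json.loads(config_path.read_text(encoding="utf-8"))
    # Assumed layout: a map of server name -> spec under one of these keys.
    servers = data.get("mcpServers") or data.get("servers") or {}
    rows = []
    for name, spec in sorted(servers.items()):
        # Local servers declare a command; remote servers declare a url.
        parts = [spec.get("command", ""), *map(str, spec.get("args", []))]
        command = " ".join(parts).strip()
        rows.append({
            "config": str(config_path),
            "server": name,
            "target": spec.get("url") or command,
            # Record which environment variables are set, never their values.
            "env_keys": sorted(spec.get("env", {})),
        })
    return rows
```

Running this over every config found on a machine produces one row per agent-to-server edge. Recording only the names of environment variables keeps secrets out of the inventory while still showing which servers receive credentials.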

Do This Now

Block 45 minutes this week. Run Bumblebee against one engineering laptop and one developer container image. The output is a draft pull request listing every language package, every MCP configuration, every VS Code extension, every browser extension found. Read it. Two surprises are typical: an extension nobody remembers installing, and an MCP config pointing at a service nobody on the team owns.

That output is the inventory. The inventory is the precondition for governance. Everything else, the policies, the approvals, the federation, presupposes that you can list what you have. The Anthropic Glasswing data confirmed defense AI works at scale. The TeamPCP campaign confirmed offense AI is operating at the platform layer. Mythos confirmed the capability gap closes in months, not years. Bumblebee removed the last excuse for not enumerating the surfaces.

The teams that win the next two years of agent operations are not the ones with the most autonomous agents. They are the ones who can answer, in writing, what their agents reach.

This analysis synthesizes “GitHub internal repositories exfiltrated via malicious VS Code extension” (ITPro, May 2026), “A hacker group is poisoning open-source code at an unprecedented scale” (Ars Technica, May 2026), “Project Glasswing: An Initial Update” (Anthropic, May 2026), “Measuring LLMs’ Ability to Develop Exploits” (Anthropic Red Team, May 2026), and “Perplexity is open-sourcing Bumblebee” (Perplexity, May 2026).
