The AI Control Problem

Your AI Provider Is a Supply Chain Risk

Thiago Victorino
9 min read

On February 23, 2026, Anthropic published something that should have sent enterprise AI teams scrambling: a detailed account of how three Chinese AI companies — DeepSeek, Moonshot AI, and MiniMax — used approximately 24,000 fake accounts to extract Claude’s most differentiated capabilities through 16 million systematic interactions.

Eleven days earlier, OpenAI had submitted a memo to the U.S. House Select Committee on China making parallel allegations against DeepSeek. Days before that, Google reported a distillation campaign targeting Gemini with over 100,000 carefully crafted prompts designed to clone its reasoning capabilities.

Three frontier labs. Three simultaneous, industrial-scale extraction campaigns. One question nobody in enterprise AI is asking: What does this mean for the competitive advantage you’re building on top of these models?

What Distillation Actually Is

Knowledge distillation is a legitimate machine learning technique. You train a smaller model on the outputs of a larger one, transferring capabilities at a fraction of the original training cost. Companies do this internally all the time — distilling a massive model into something that runs on edge devices.
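For readers who haven’t seen the technique, a minimal sketch of the benign version follows, using a toy classifier rather than an LLM. The models, sizes, and hyperparameters are illustrative assumptions, not any lab’s actual pipeline.

```python
# A minimal sketch of benign knowledge distillation: a student model is
# trained to match a teacher's softened output distribution. Toy
# classifier for illustration; LLM distillation works on next-token
# distributions but follows the same pattern.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened output distributions."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, soft_targets,
                    reduction="batchmean") * temperature ** 2

# Toy teacher and student; in practice the teacher is the large trained model.
teacher = torch.nn.Linear(16, 8).eval()
student = torch.nn.Linear(16, 8)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

x = torch.randn(32, 16)               # one batch of inputs
with torch.no_grad():                 # the teacher is frozen
    teacher_logits = teacher(x)
loss = distillation_loss(student(x), teacher_logits)
loss.backward()
optimizer.step()
```

The point to notice: the student never sees the teacher’s weights or training data, only its outputs. That is exactly why API access alone is enough surface for the adversarial version.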

What Anthropic described is something different. This is adversarial distillation: competitors systematically extracting your model’s capabilities through API access, using infrastructure designed to evade detection.

The mechanics are sophisticated. MiniMax alone generated 13 million exchanges targeting Claude’s agentic coding and tool orchestration capabilities, precisely the features enterprises are building production workflows around. Moonshot AI used hundreds of fraudulent accounts across multiple access pathways; attribution was confirmed when request metadata matched the profiles of senior staff. DeepSeek focused on reasoning, rubric-based grading, and generating censorship-safe alternatives to politically sensitive queries.

One proxy network managed over 20,000 simultaneous fraudulent accounts, mixing distillation traffic with legitimate customer requests. When accounts were banned, replacements activated automatically. A hydra architecture: cut off one head, two more appear.

The Supply Chain Angle You’re Missing

When a manufacturing company evaluates a critical supplier, it assesses risks: financial stability, quality control, geographic concentration, and — increasingly — cybersecurity posture. If a tier-1 supplier has poor security and a competitor extracts their proprietary manufacturing processes, the downstream customer loses their competitive advantage too.

AI model providers are now tier-0 suppliers. They sit underneath everything. Your agentic workflows, your custom fine-tunes, your production pipelines — all of it depends on capabilities that exist inside someone else’s model. When those capabilities can be extracted at scale, your competitive moat is downstream of someone else’s security.

Anthropic detected 24,000 fake accounts because they have dedicated security teams, behavioral fingerprinting, and systems that detect chain-of-thought elicitation. They caught coordinated activity patterns across hundreds of accounts targeting narrow capability domains.

The question enterprise leaders should be asking is not “how did Anthropic catch them?” but “how many providers can’t?”

If you’re building on a smaller model provider, a fine-tuning platform, or a vertical AI tool — does that provider have behavioral fingerprinting? Do they detect coordinated account activity? Do they distinguish between legitimate use and systematic capability extraction? Most don’t. Many can’t.
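To make “behavioral fingerprinting” less abstract, here is a hedged sketch of one signal such a system might compute: groups of accounts whose traffic all concentrates on the same narrow capability domain. The embeddings, thresholds, and account structure are illustrative assumptions, not any provider’s actual detection pipeline.

```python
# One illustrative coordinated-activity signal: flag accounts whose
# request embeddings are (a) narrowly focused and (b) focused on the
# same domain as other narrowly focused accounts. All thresholds are
# assumptions for illustration.
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def concentration(embeddings: np.ndarray) -> float:
    """Mean cosine similarity of an account's request embeddings to
    their centroid: near 1.0 means narrowly focused traffic, while
    legitimate usage tends to be far more diverse."""
    centroid = normalize(embeddings.mean(axis=0))
    return float((normalize(embeddings) @ centroid).mean())

def flag_coordinated(accounts: dict[str, np.ndarray],
                     focus: float = 0.9,
                     overlap: float = 0.95) -> set[str]:
    """Flag pairs of focused accounts whose focus is the same domain."""
    centroids = {acct: normalize(emb.mean(axis=0))
                 for acct, emb in accounts.items()
                 if concentration(emb) > focus}
    flagged: set[str] = set()
    ids = list(centroids)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if float(centroids[a] @ centroids[b]) > overlap:
                flagged.update({a, b})
    return flagged

# Usage with synthetic data: two "accounts" drawn from the same tight
# cluster should be flagged; a diverse account should not.
rng = np.random.default_rng(0)
base = rng.normal(size=16)
accounts = {
    "acct-1": base + 0.05 * rng.normal(size=(50, 16)),
    "acct-2": base + 0.05 * rng.normal(size=(50, 16)),
    "acct-3": rng.normal(size=(50, 16)),  # diverse, legitimate-looking
}
print(flag_coordinated(accounts))  # expected: {'acct-1', 'acct-2'}
```

A real system layers many more signals on top (timing, proxy infrastructure, prompt templates); the sketch only shows why narrow, correlated traffic across many accounts stands out against ordinary usage.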

Export Controls Don’t Protect What You Think They Protect

The U.S. chip export control strategy rests on an assumption: limiting access to advanced hardware limits the ability to train frontier models. Distillation attacks expose the flaw in this reasoning.

You don’t need H100 clusters to distill capabilities from an existing model. You need API access and enough accounts to extract training data at scale. As Anthropic’s own report notes, “apparent rapid advancements appear as evidence of ineffective controls, though advancements depend significantly on extracted American model capabilities.”

This creates a paradox. Export controls restrict hardware. But the extraction happens through software — through the very APIs that companies sell commercially. The policy protects the training pipeline while leaving the knowledge pipeline wide open.

When MiniMax pivoted within 24 hours of a new Claude model release, redirecting nearly half their distillation traffic to the new version, they demonstrated adversarial agility that neither export controls nor terms of service can match.

Representative John Moolenaar, chair of the House China committee, framed the dynamic bluntly: “This is part of the CCP’s playbook: steal, copy, and kill.” Whether you accept the geopolitical framing or not, the supply chain risk is real regardless of who’s doing the extracting.

What Gets Lost in Distillation

The most dangerous aspect of distilled models isn’t the capability theft. It’s what doesn’t transfer.

Distillation captures outputs — the patterns, reasoning traces, and responses that make a model useful. It does not capture the safety training: the RLHF, the Constitutional AI, the red-teaming, the refusal boundaries that took months of alignment work.

Anthropic describes this directly: “Illicitly distilled models lack necessary safeguards.” A model distilled for agentic coding capabilities inherits the coding ability but not the guardrails that prevent that ability from being applied to offensive cyber operations. A model distilled for reasoning inherits the reasoning but not the alignment that prevents that reasoning from being weaponized.

When these distilled models are open-sourced — as DeepSeek R1 already is — the stripped capabilities proliferate beyond any government’s control. This is not a theoretical risk. It’s the current state of play.

What This Means for Your AI Strategy

If you’re an enterprise leader building on AI model capabilities, the distillation threat reframes your risk calculus:

Model provider security is now a procurement criterion. You evaluate your cloud provider’s SOC 2 compliance. You audit your SaaS vendor’s data handling practices. Why aren’t you evaluating your AI model provider’s resistance to capability extraction? Ask them: What detection systems do you have for distillation attacks? How do you identify coordinated account activity? What happened the last time someone tried to systematically extract your model’s capabilities?

Your competitive advantage has a provenance problem. If the capabilities you’re building on can be extracted and redistributed through distillation, any competitor with API access to the same provider has a shortcut to your capabilities. The question is not whether your implementation is unique — it’s whether the underlying model capabilities are defensible.

Shadow AI multiplies the attack surface. If departments in your organization are using AI tools and platforms without centralized governance, each shadow deployment is a potential link in someone else’s distillation chain. Your employees’ prompts and usage patterns contribute to the signal that distillation attackers analyze.

Vendor concentration creates systemic exposure. If 80% of enterprise AI runs on three providers — and all three are simultaneously under industrial-scale extraction campaigns — the systemic risk is not hypothetical. It’s the current operating environment.

What Governance Looks Like Here

This is not a problem that better model cards or responsible AI principles can solve. This is supply chain governance applied to a new category of critical dependency.

The organizations that navigate this well will be the ones that:

Map their AI supply chain. Not just “which models do we use?” but “which capabilities do we depend on, where do those capabilities reside, and how defensible are they?” This requires treating model providers with the same rigor as any critical infrastructure supplier; a sketch of what such a map might record follows this list.

Assess provider security posture. Every model provider should be able to articulate their detection and prevention capabilities for distillation attacks. If they can’t, that’s a risk factor — not a disqualifier necessarily, but something that belongs in your risk register.

Diversify intelligently. Single-provider dependency on a model whose capabilities are being actively extracted is a concentration risk. This doesn’t mean abandoning frontier models. It means understanding which capabilities are most exposed and building redundancy where it matters.

Monitor the landscape. Anthropic’s report, OpenAI’s congressional testimony, and Google’s threat intelligence disclosures are the beginning of a pattern, not isolated events. The organizations that track this emerging risk category will be the ones that aren’t caught off guard when the next disclosure happens.
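To make the mapping exercise concrete, here is a hedged sketch of what a capability-level supply-chain entry might record, as referenced in the first item above. The schema and example values are assumptions for illustration, not an established standard.

```python
# A hedged sketch of a capability-level supply-chain map. Field names
# and the example entry are illustrative assumptions, not a standard
# schema or a real vendor assessment.
from dataclasses import dataclass

@dataclass
class CapabilityDependency:
    capability: str           # what you actually depend on
    provider: str             # where that capability resides
    workloads: list[str]      # production systems built on it
    extraction_exposure: str  # "low" / "medium" / "high"
    fallback: str | None = None  # redundancy, if any exists

registry = [
    CapabilityDependency(
        capability="agentic coding and tool orchestration",
        provider="frontier-lab-api",  # placeholder, not a real vendor
        workloads=["internal dev agent", "customer support triage"],
        extraction_exposure="high",   # per the campaigns described above
        fallback=None,                # a concentration risk worth flagging
    ),
]

# A map like this turns "which models do we use?" into an auditable
# risk register: every high-exposure entry with no fallback becomes a
# concrete item for the governance conversation.
for dep in registry:
    if dep.extraction_exposure == "high" and dep.fallback is None:
        print(f"Concentration risk: {dep.capability} via {dep.provider}")
```

The value is not the code; it is forcing the inventory down to the capability level, where extraction exposure actually lives.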

No company solves this alone. Anthropic’s conclusion is correct on that point. But waiting for industry coordination while building production systems on extractable capabilities is not a governance position. It’s a hope dressed up as strategy.


The distillation data referenced in this article comes from Anthropic’s February 23, 2026 report. OpenAI’s congressional testimony was reported by Bloomberg on February 12, 2026. Google’s distillation attack disclosures come from the GTIG AI Threat Tracker. All specific numbers (24,000 accounts, 16 million exchanges) are Anthropic’s self-reported figures; no independent audit has been published.

If this resonates, let's talk

We help companies implement AI without losing control.

Schedule a Conversation