The AI Control Problem

The AI Adoption Spectrum: What the 6x Productivity Gap Really Means for Enterprises

Thiago Victorino

In December 2025, OpenAI published an enterprise report based on roughly 100 companies and 9,000 workers. The headline finding: workers at the 95th percentile of AI usage are six times more productive than the median employee. For coding tasks, the gap widens to 17x. Frontier firms generate twice as many AI messages per employee and seven times more messages through custom-built GPTs.

These numbers describe a real phenomenon. But the framing matters enormously.

Martin Alderson recently argued that two distinct kinds of AI users are emerging: power users running Claude Code, MCPs, and custom Python environments on one side, and mainstream users doing basic ChatGPT prompting on the other. His observation is directionally correct. There is a widening capability gap. But the binary framing — two camps, cleanly divided — obscures more than it reveals.

The reality is a spectrum. And misunderstanding its shape leads to exactly the wrong strategic response.

The Spectrum, Not the Binary

A binary framing is seductive because it simplifies decision-making. If there are two kinds of users, the answer seems obvious: move everyone from the low group to the high group. Remove the barriers. Install the tools. Problem solved.

But the OpenAI data tells a more complicated story. The 6x gap is not between “users” and “non-users.” It is between the 95th percentile and the median — among people who are all already using AI. The distribution is continuous, not bimodal.

This distinction matters for strategy. A bimodal distribution implies a barrier to cross. A continuous distribution implies a capability to build. These are fundamentally different organizational problems.
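To make the distinction concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that per-worker AI usage follows a lognormal distribution (the OpenAI report does not specify the shape): a single smooth, unimodal curve reproduces a 6x gap between the 95th percentile and the median with no "two camps" anywhere in it.

```python
import numpy as np

# Illustrative assumption: lognormal usage. Choose sigma so that the
# p95-to-median ratio is 6x: p95/median = exp(1.645 * sigma) = 6,
# hence sigma = ln(6) / 1.645, roughly 1.09.
rng = np.random.default_rng(42)
sigma = np.log(6) / 1.645
usage = rng.lognormal(mean=0.0, sigma=sigma, size=100_000)

median = np.median(usage)
p95 = np.percentile(usage, 95)
print(f"p95 / median = {p95 / median:.1f}x")  # ~6.0x from one continuous curve
```

If the real distribution looks anything like this, there is no barrier to move people across, only a long tail to climb. That is a capability-building problem.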

Only 6% of enterprises have moved generative AI beyond pilot to production, according to Gartner. Copilot penetration sits at roughly 3% of Microsoft’s 450 million paid seats. Most organizations are not even at the median, let alone the frontier. The gap is real, but it is not a gap between two kinds of people. It is a gap between organizational maturity levels.

The Microsoft Paradox

No single data point illustrates the complexity better than Microsoft’s own behavior.

In January 2026, Microsoft directed engineers across Windows, Microsoft 365, Teams, Bing, Edge, and Surface to install Claude Code alongside Copilot. Reports indicate that Satya Nadella internally acknowledged that Copilot integrations “don’t really work.”

Read that again. The company that sells Copilot across a base of 450 million paid seats told its own engineers to use a competitor's tool.

This is not hypocrisy. It is evidence that the adoption spectrum is real even within the most sophisticated technology companies. Microsoft’s Copilot works for surface-level code completion — the kind of assistance that helps the median user. But for the deep, agentic workflows that frontier workers rely on — multi-file refactoring, codebase-wide reasoning, complex debugging — Microsoft’s own engineers needed something different.

The lesson is not that Copilot is bad. It is that different positions on the adoption spectrum require fundamentally different tooling. A single product cannot serve the entire range.

The Perception Problem

Before assuming that moving everyone to frontier tooling solves the problem, consider what METR found in their randomized controlled trial.

Experienced open-source developers — people with over five years of contribution history to the codebases they were working on — were given AI coding tools for a set of real tasks. They perceived a 24% speedup. Actual measurement showed they were 19% slower.

A 43-percentage-point perception gap: +24% perceived versus −19% measured.

This finding does not mean AI tools are useless. It means that subjective experience is a dangerously unreliable guide to AI productivity. Developers felt faster because AI tools reduce certain cognitive loads — looking up syntax, generating boilerplate, exploring unfamiliar APIs. That relief feels like speed. But in mature codebases with established patterns, the overhead of prompting, evaluating, and correcting AI output exceeded the time saved.

The implication for the adoption spectrum is significant. Organizations cannot simply measure who is “using AI” and who is not. They need to measure what AI usage actually produces. And most of them are not doing this. Eighty percent of engineering teams, according to Jellyfish data, cannot answer basic questions about their AI ROI.

Why the Enterprise Problem Is Not What You Think

The common narrative goes like this: enterprises are slow, locked down, and bureaucratic. Smaller companies with modern tooling can move faster and outperform them. Remove the enterprise constraints and the gap closes.

This narrative has elements of truth. Enterprise environments often feature locked-down developer machines, no access to external APIs, siloed engineering teams, and security policies that prevent the installation of tools like Claude Code. These constraints are real friction.

But calling them the problem confuses cause and effect.

Enterprise constraints exist because enterprises operate at a scale where uncontrolled AI usage creates genuine risk. A startup developer running Claude Code on a personal laptop with full internet access has a different threat surface than an engineer at a financial institution who handles customer data under regulatory obligations and compliance requirements that carry real legal consequences.

The Microsoft example is instructive here, too. Microsoft did not tell its engineers to ignore security controls. It built a governed path to using Claude Code that met its internal requirements. The constraint was not eliminated. It was engineered around.

This is the critical reframe: the enterprise problem is not “remove barriers to AI adoption.” It is “build governed infrastructure that enables sophisticated AI usage without creating uncontrolled risk.”

The Data Moat Advantage

The binary narrative also misses something important about where enterprises actually win.

Power users at startups can run sophisticated AI workflows on public data, open-source codebases, and general knowledge. That is genuinely powerful. But the highest-value AI applications require proprietary data: customer behavior patterns, operational history, domain-specific knowledge, institutional context.

Enterprises have this data. Startups mostly do not.

The organizations that will capture the most value from AI are not the ones with the fewest constraints. They are the ones that can connect frontier AI tooling to proprietary data through governed infrastructure. This combination — sophisticated tools plus unique data plus operational governance — is not something a two-person startup can replicate by installing Claude Code.

The 7x gap in custom GPT usage that OpenAI found at frontier firms is not about individual skill. It is about organizational infrastructure. Custom GPTs require internal data, defined workflows, and integration with enterprise systems. Building them well requires exactly the kind of institutional capability that enterprises possess and startups lack.

Governance as Force Multiplier

There is a persistent assumption that governance and speed are inversely correlated. More governance means less agility. The fastest organizations are the least governed.

The data suggests the opposite.

Organizations with robust AI governance frameworks consistently outperform ungoverned ones, not because governance prevents mistakes — though it does — but because governance builds the organizational trust necessary to expand AI autonomy. When leadership trusts that AI usage is monitored, compliant, and auditable, they approve broader deployment. When they do not trust the controls, they restrict everything.

This is why 94% of enterprises remain stuck in pilot mode. It is not that they lack ambition or tools. It is that they lack the governance infrastructure that would let them scale responsibly. The governance is not the friction. The absence of governance is.

A well-governed AI environment lets an engineer use Claude Code on production code because the organization knows what data the tool can access, what actions it can take, and how its outputs are reviewed. An ungoverned environment either blocks the tool entirely or allows it without controls — both of which are worse outcomes.
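What "knows what data the tool can access and what actions it can take" can look like in practice: a minimal, hypothetical policy gate, sketched in Python. The action names, data classes, and rules are illustrative assumptions, not any vendor's real API.

```python
# Hypothetical governance-as-code sketch: explicit, reviewable rules for
# what an AI coding tool may do and what data it may touch.
ALLOWED_ACTIONS = {"read_repo", "open_pr"}          # no direct push to main
BLOCKED_DATA_CLASSES = {"customer_pii", "secrets"}  # never exposed to the tool

def is_permitted(action: str, data_class: str) -> bool:
    """Allow an action only if it is allowlisted and the data is not blocked."""
    return action in ALLOWED_ACTIONS and data_class not in BLOCKED_DATA_CLASSES

assert is_permitted("open_pr", "internal_code")
assert not is_permitted("push_main", "internal_code")  # action not allowlisted
assert not is_permitted("read_repo", "customer_pii")   # data class blocked
```

The specific rules matter less than the fact that they are explicit, versioned, and auditable. That is what gives leadership the confidence to approve broader deployment.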

What This Means for Strategy

If you are leading an enterprise AI strategy, the adoption spectrum framework suggests a different playbook than the binary one.

Stop sorting people into two buckets. The 6x gap is real, but it is a distribution, not a divide. Understand where your organization sits on the spectrum and design interventions for your actual position, not a theoretical one.

Invest in infrastructure, not just tools. The gap between frontier firms and median firms in the OpenAI data is not primarily about which AI tools are available. It is about custom GPTs, integrated workflows, and internal data connections. This is infrastructure work.

Build governance that enables, not blocks. Microsoft did not achieve frontier AI usage by removing controls. It achieved it by building better controls — ones that accommodated sophisticated tools within enterprise constraints. Your security team is not the enemy of AI adoption. An ungoverned environment is.

Measure actual outcomes, not adoption. The METR study proves that usage and productivity are not the same thing. Instrument your systems. Track what AI usage actually produces. Do not rely on developer surveys or adoption dashboards as proxies for value.
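One way to make that concrete, as a minimal sketch: assume you already export per-team delivery metrics (the field names and numbers below are hypothetical), and read outcome deltas alongside the adoption delta, never adoption alone.

```python
from dataclasses import dataclass

@dataclass
class TeamMetrics:
    team: str
    ai_adoption_rate: float     # share of engineers actively using AI tools
    cycle_time_days: float      # median PR cycle time
    change_failure_rate: float  # share of deploys causing incidents

def ai_roi_signal(before: TeamMetrics, after: TeamMetrics) -> dict:
    """Pair the adoption delta with outcome deltas so they are read together."""
    return {
        "team": after.team,
        "adoption_delta": after.ai_adoption_rate - before.ai_adoption_rate,
        "cycle_time_delta": after.cycle_time_days - before.cycle_time_days,
        "failure_rate_delta": after.change_failure_rate - before.change_failure_rate,
    }

# Hypothetical numbers: adoption jumped, cycle time got worse.
print(ai_roi_signal(TeamMetrics("payments", 0.10, 4.2, 0.08),
                    TeamMetrics("payments", 0.85, 4.6, 0.09)))
```

High adoption paired with flat or worsening outcomes is exactly the METR pattern this instrumentation exists to surface.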

Leverage your data advantage. If you are an enterprise sitting on proprietary data, your competitive advantage is not about matching startup speed. It is about connecting AI capabilities to data that nobody else has, through infrastructure that ensures compliance and security.

The adoption spectrum will continue to widen. That is almost certain. But the organizations that thrive will not be the ones that simply moved fastest. They will be the ones that built the governed infrastructure to move fast repeatedly, at scale, on proprietary data, without creating the kind of risk that stops the whole program.

The gap is real. The framing determines whether you close it.


This essay draws on OpenAI’s Enterprise Report (December 2025), METR’s randomized controlled trial on AI developer productivity (July 2025), Gartner’s enterprise AI data (2025), Martin Alderson’s “Two Kinds of AI Users Are Emerging” (February 2026), and Jellyfish’s 2025 AI Metrics in Review.

If this resonates, let's talk

We help companies implement AI without losing control.

Schedule a Conversation