
McKinsey's Three Diagnoses: How a Consulting Firm Slowly Discovers Governance

Thiago Victorino

In the span of four months, McKinsey published three pieces that tell a story the firm itself has not fully articulated. Each one gets closer to the same conclusion. None of them states it outright.

The pattern matters more than any individual article. Because when a firm with McKinsey’s reach iterates toward a diagnosis it cannot fully name, the iteration itself becomes the evidence.

Diagnosis One: The Measurement Was Fine

In November 2025, McKinsey surveyed 300 executives about AI’s impact on software development. The executives reported 16-45% productivity gains. McKinsey packaged those perceptions as findings.

The problem, as we explored in “McKinsey Measured the Wrong Thing,” was not the conclusion. It was the methodology. METR’s randomized controlled trial found developers 19% slower with AI while believing they were 24% faster. The NBER’s February 2026 survey of 6,000 executives found that more than 80% reported zero measurable productivity gains. The perception-measurement gap was not noise. It was directionally wrong.

At that point, McKinsey’s position was implicitly: AI works, executives confirm it works, the challenge is organizational adoption. The prescription: upskilling, change management, end-to-end implementation, AI-native roles. Five recommendations that mapped precisely to five McKinsey service lines.

Diagnosis Two: The Problem Is Design

On March 11, 2026, Chris Smith, a partner in McKinsey’s design practice, published a Re:think essay arguing that AI’s scaling problem is not a technology problem. It is a design problem. Organizations bolted chat boxes onto pre-AI workflows and expected transformation.

Smith identifies four design characteristics AI experiences need: clarity (reveal reasoning), continuity (remember context), depth (automate workflows), and collaboration (humans and AI steering together, not correcting after the fact).

The case studies are compelling in their direction: a marketing tool with 75% adoption and a sales lift of more than 2%, a sales tool adopted by 90% of users, and a hotel management tool where nearly all users deployed it once it revealed its reasoning.

The prescription shifted. It was no longer just organizational change. It was design. And Smith writes from McKinsey’s design practice, built through the acquisitions of LUNAR Design in 2015 and Veryday in 2016. The diagnosis, once again, pointed to what McKinsey sells.

As we analyzed in “Design Without Governance Is Decoration,” each of Smith’s four principles describes a desirable surface behavior while skipping the infrastructure required to produce it. Clarity requires explainability systems. Continuity requires data governance. Depth requires permission boundaries. Collaboration requires decision accountability.
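
To make the distinction concrete, here is a minimal Python sketch of what sits beneath two of those principles. Everything in it is illustrative: the allowed-action list, the record shape, and the function names are assumptions, not anyone’s actual implementation. The point is that “reveal reasoning” and “automate workflows” are one-line UI behaviors only if an audit log and a permission boundary already exist underneath.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical policy: which actions the agent may take autonomously.
ALLOWED_ACTIONS = {"draft_email", "summarize_doc"}

@dataclass
class DecisionRecord:
    """Clarity as infrastructure: the reasoning the UI displays is
    also persisted as an auditable record, not just rendered."""
    action: str
    rationale: str
    approved: bool
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def execute(action: str, rationale: str, audit_log: list[DecisionRecord]) -> bool:
    """Depth as infrastructure: every automated step passes a
    permission boundary and leaves a trace, approved or not."""
    approved = action in ALLOWED_ACTIONS
    audit_log.append(DecisionRecord(action, rationale, approved))
    return approved

log: list[DecisionRecord] = []
execute("draft_email", "User asked for a follow-up message", log)
execute("send_payment", "Invoice appears overdue", log)  # blocked: outside boundary

for record in log:
    print(record)
```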

Diagnosis Three: The Word McKinsey Used Without Defining

Here is the sentence that makes the three-article arc significant. In his closing paragraph, Smith writes:

“The next frontier of AI will be about reinventing the architecture of collaboration, driven by systems that make intelligence understandable, governable, and usable at scale.”

Governable. The word appears once, without elaboration, in an 800-word essay about design. Smith does not define what governable means. He does not describe what governance infrastructure looks like. He does not connect it to the four principles he just outlined. But he uses it.

This is the diagnostic progression in miniature. From “AI works, just adopt it properly” to “AI needs better design” to “AI needs to be governable.” Each step gets closer. Each step stops short of the structural conclusion.

What the Iteration Reveals

The pattern across McKinsey’s three pieces maps a common trajectory in how organizations discover governance needs.

Stage 1: Perception as evidence. The technology is adopted. Leaders report it is working. Nobody measures independently. The metrics that exist measure activity, not outcomes. Surveys confirm the investment thesis. Everyone is satisfied.

Stage 2: Design as diagnosis. When adoption stalls at pilot scale, the explanation shifts from “it works, just scale it” to “the user experience needs improvement.” This is a real problem. But it is the visible symptom, not the structural cause. Organizations redesign interfaces while leaving the underlying infrastructure unchanged.

Stage 3: Governance as requirement. Eventually, someone says the word. Usually quietly, in a subordinate clause, without a full framework attached. The recognition arrives before the architecture does. Smith’s use of “governable” is precisely this moment: the acknowledgment that design principles without governance infrastructure produce design theater.

Most organizations experience this same progression. They spend months in Stage 1, convinced AI is delivering value based on perception data. They invest in Stage 2 when adoption fails to scale, hiring design consultants to improve the experience. They arrive at Stage 3 when they realize the design improvements cannot be sustained without underlying systems for explainability, data governance, workflow boundaries, and decision accountability.

The Convergence Nobody Planned

McKinsey is not alone in this trajectory. The industry data is converging from multiple directions.

BCG’s AI Radar 2025 found only 26% of companies generate significant financial returns from AI. The top barriers they identified were data quality, talent gaps, unclear ROI measurement, and security concerns. “Poor design” did not make the list. Deloitte’s Q4 2024 report found 68% of organizations moved fewer than 30% of gen AI experiments to production. Their analysis pointed to governance and data infrastructure, not user experience.

Apple and Carnegie Mellon’s IUI’26 research found that agent trust collapses within a single interaction without substantive explanation. What restores it is not the appearance of transparency but the mechanism of accountability. Microsoft’s Magentic-UI project demonstrated that structured collaboration improves agent performance by 71%, but the improvement came from six governance mechanisms: co-planning, co-tasking, action approval, answer verification, memory, and multi-tasking.
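
For readers who want the texture of what a mechanism like “action approval” means in practice, here is a hedged sketch. It is not Magentic-UI’s API; the function names and the plan are invented for illustration. It shows the shape of the mechanism: the agent proposes steps, a reviewer gates them, and only approved steps run.

```python
from typing import Callable

def run_with_approval(
    plan: list[str],
    approve: Callable[[str], bool],
    execute_step: Callable[[str], None],
) -> None:
    """Execute only the plan steps a reviewer explicitly approves."""
    for step in plan:
        if approve(step):
            execute_step(step)
        else:
            print(f"skipped (not approved): {step}")

# Example policy: auto-approve read-only steps, hold anything that sends.
plan = ["read customer record", "draft reply", "send reply"]
run_with_approval(
    plan,
    approve=lambda step: not step.startswith("send"),
    execute_step=lambda step: print(f"executed: {step}"),
)
```

The design choice worth noticing is that approval happens before execution, not as a correction afterward, which is exactly the collaboration pattern Smith’s fourth principle describes.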

Every research stream is arriving at the same place. The word differs — governance, infrastructure, accountability, guardrails — but the destination is the same. What makes AI scalable is not the model, not the interface, but the systems that make outputs verifiable, decisions traceable, and behaviors bounded.

Why Consulting Firms Cannot Say This Directly

There is a structural reason McKinsey’s diagnosis stops at design. Governance infrastructure is not a consulting engagement. It is an engineering and organizational build that requires permanent investment in systems, not a project with a defined end date and a final deliverable.

A design engagement has clear scope: audit the current experience, identify friction points, redesign the interaction, measure adoption. It starts and ends. It produces artifacts. It fits in a statement of work.

Governance infrastructure does not work that way. Explainability systems need continuous maintenance. Data governance policies need ongoing enforcement. Workflow boundaries need regular recalibration. Decision accountability needs permanent institutional commitment. These are operational capabilities, not project outcomes.

Consulting firms sell projects. Governance is not a project. This is not a criticism. It is a structural observation about why the diagnosis consistently stops one layer short of the answer.

The Practical Implication

If your organization is stuck in pilot purgatory, consider which stage of diagnosis you are in.

If you are still relying on executive perception to assess AI value, you are in Stage 1. The first step is not design. It is measurement. Build the infrastructure to know whether AI is helping before deciding what to fix.
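
What does that measurement infrastructure look like at its most minimal? Something like the sketch below: tag completed work with whether AI assistance was used, record an outcome metric, and compare. The task records and numbers here are hypothetical; the only claim is that even a crude measured delta beats a perception survey.

```python
from statistics import mean

# Hypothetical task records: (ai_assisted, hours_to_complete).
tasks = [
    (True, 6.5), (True, 7.0), (True, 5.8),
    (False, 5.2), (False, 6.1), (False, 5.5),
]

with_ai = [hours for assisted, hours in tasks if assisted]
without_ai = [hours for assisted, hours in tasks if not assisted]

# A measured delta, however crude, is outcome data rather than perception.
delta = (mean(with_ai) - mean(without_ai)) / mean(without_ai)
print(f"AI-assisted tasks took {delta:+.0%} time vs. baseline")
```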

If you have invested in design improvements and adoption still will not scale, you are in Stage 2. The interface is not the bottleneck. The infrastructure behind it is. No amount of UX improvement compensates for missing explainability, unmanaged data persistence, unbounded workflow automation, or unaccountable collaboration.

If you have arrived at Stage 3 and someone in the room has said the word “governance” but nobody has defined what it means for your organization, that is where the real work begins. Not a project. An organizational capability.

McKinsey will likely reach this diagnosis explicitly in a future publication. The trajectory is clear. In the meantime, the gap between where their diagnosis is and where the evidence points is the gap where organizations get stuck.


Sources: Chris Smith, “Improved user experiences could unleash the full potential of AI,” McKinsey Re:think, March 2026. McKinsey, “Supercharging software development with generative AI,” November 2025. METR randomized controlled trial, 2025. NBER executive survey, February 2026. BCG AI Radar 2025. Deloitte Q4 2024 gen AI report. Apple/CMU IUI’26 agent UX taxonomy. Microsoft Magentic-UI research.

Victorino Group builds the governance infrastructure that consulting diagnoses eventually point to. If your organization has moved past design and needs the systems underneath, let’s talk.
