The Workflow Is the Governance Primitive

Thiago Victorino

Jakob Nielsen spent 43 years arguing that software fails at the interface. The usability heuristics you learned in school are his. The Nielsen Norman Group is his. Seventy-nine patents on internet usability are his.

Last week, at UX Tigers, he published a piece that reverses the direction of his own career. The headline: AI’s design problem is organizational, not technological. The sentence that matters: “the real design problem is not just interaction design. It is organizational design expressed through software.”

Read that again. The person who built the field of usability is saying the interface is not where the AI problem lives.

This is the rare moment when a domain authority publicly revises the scope of their own domain. It deserves attention. But the useful move is not to quote Nielsen. The useful move is to take his argument one step further and make it testable.

What the data actually says

Nielsen is not speaking from taste. He is synthesizing three pieces of evidence that converge on the same point.

First, Bick, Blandin, and Deming’s NBER working paper from this spring. They decompose the gap between 43% US GenAI adoption and 32% EU adoption and find that more than 80% of it disappears once you control for one variable: whether managers explicitly encourage AI use. Not training. Not infrastructure. Not culture in the abstract. Explicit manager behavior. Measured on the HR side, correlated with time-use diaries on the worker side. The variables are independent. The finding survives controls for industry, firm size, country, and demographics.

Second, the Stanford AI Index 2026. Eighty-eight percent of organizations report adoption. Seventy percent use GenAI in at least one business function. Productivity gains concentrate in a small leading cohort. Stanford calls the distance between “everyone uses AI” and “most organizations have not changed how work gets done” an execution gap.

Third, MIT NANDA’s “GenAI Divide” report. Ninety-five percent of enterprise pilots produce zero measurable P&L impact. The models are not the problem. The operating model is.

Three independent studies, three methodologies, one conclusion. The binding constraint on AI outcomes is not the model and not the interface. It is the organization.

The extension Nielsen stops short of

Nielsen’s prescription is that products must embed governance scaffolding (policies, approved workflows, transparency dashboards) directly in the tool. He is right. But he stops one step short.

Here is the step he leaves on the table: the workflow is the governance primitive.

Permissions are derivative. Policies are derivative. Audit logs are derivative. They are shadows cast by the shape of the work itself. If the workflow is designed correctly, trust is produced by the handoff, not asserted by the policy. Compliance is observed in the step sequence, not enforced by a gate. Oversight is measured by what the workflow reveals, not by who was allowed in.

This is not a rhetorical flourish. It is a testable claim. Three ways to test it:

  1. Correlate workflow structure with downstream error rate. If the shape of the work predicts quality better than the access list does, workflow is the primitive.
  2. Compare management encouragement against training inside the same firm. Bick, Blandin, and Deming already did half this work at the country level. The same experiment inside an enterprise is trivial to run.
  3. A/B test centaur workflows against handoff workflows on mixed human-plus-AI teams. Measure output quality, time-to-completion, and error-recovery cost.

Workflow-as-governance-primitive is a hypothesis we are building infrastructure to test. It is not yet a proven law. But it is falsifiable, which is more than can be said for most AI governance frameworks currently in circulation.

What Ramp and Meta tell us

Two firms are running the experiment at scale, at opposite ends of the coercion spectrum.

Ramp’s Chief Product Officer, Geoff Charles, has published what amounts to Nielsen’s thesis in production form. An L0 to L3 proficiency ladder. Public measurement of team adoption. Deliberately removed friction. A definition of “good” followed by permission to build toward it. The reported result: 99.5% active AI users at a $32B company. As we covered in Ramp’s org-design-first AI adoption playbook, the 99.5% number is self-reported and should be handled with that caveat. The playbook, however, is the real asset. The levers Ramp pulled are the same levers Bick, Blandin, and Deming identified: encouragement, visible measurement, workflow ladders, removed friction.

Meta is the hard-edge version. As we detailed in Meta’s AI-native structural mandate, the creation organization’s goal is that 65% of engineers ship more than 75% of their committed code with AI assistance by H1 2026. Up to 15% of performance appraisals are weighted toward AI contribution. Roughly 8,000 layoffs in May reallocated headcount to AI-focused pods. Some pods run 50 individual contributors per manager. If Bick, Blandin, and Deming are right that encouragement explains 80% of adoption variance, then Meta’s coercion-scale encouragement is rational, even if it is brutal.

Meta is a bet, not a proven case. The mandate is months old. Outcomes are not measurable until late 2026 at the earliest. Meta might hit the 75% AI-code target with brittle code and then fire the humans who could have fixed it. It might produce a Conway’s Law failure where flat pods cannot integrate. It might ship a productivity boost followed by a maintenance cliff. We should cite Meta as evidence of belief, not evidence of success, and the tension between mandate and mental model is exactly what we examined in governance meets mandates.

Ramp and Meta bracket the space. Both are running Nielsen’s thesis. The question for every other firm is not whether to follow one of them. It is how to redesign workflows so that encouragement, measurement, and removed friction compound inside the work itself rather than around it.

Where the thesis could be wrong

Honesty requires naming the counter-cases.

Small shops with high skill density routinely out-ship large enterprises with formal workflow design. Cursor, early Anthropic, consultancy solo operators. At small scale, individual skill substitutes for workflow design. The thesis is scale-dependent. Under 50 people, workflow shape is probably noise. Above 500, it is probably binding.

Bad workflow design is worse than no workflow design. Many Fortune 500 firms have rolled out AI governance councils, approved-tool catalogs, and training programs, and they still sit squarely in the MIT 95% no-ROI bucket. The thesis only works when workflow design is executed well: Ramp-style ladders, public measurement, removed friction. Committee-driven process theater produces the opposite. This is also why governance is leaving the engineering silo and landing in every function at once. Workflow design is a general-purpose skill, not a technical one.

Capability jumps could flatten the curve. If model reliability reaches 95% on enterprise tasks, the jagged frontier smooths out and management encouragement may matter less. This thesis has a capability-regime dependency. Workflow-as-governance is the primitive in the current regime, not a permanent law. What persists across regimes is the measurement itself, which is why evaluation-driven agent operations is a companion discipline regardless of where model capability lands.

The testable claim, stated plainly

The workflow, not the permission and not the model, is the governance primitive.

Permissions are what we put on the door. The workflow is the shape of the room. Policies are what we write on the wall. The workflow is what actually happens inside. If we measure the workflow, we see governance happening in real time. If we only measure the permission, we see who was allowed near the door.

This claim is falsifiable. If workflow-design-first is wrong, we should expect Fortune 500 AI ROI to be uncorrelated with workflow maturity by the end of 2027. If it is right, the cohort that measures and redesigns workflows will pull away from the cohort that writes policy.

Nielsen took the field of usability to its honest conclusion: the problem is not the interface. The next honest conclusion is that the problem is not the permission either. It is the shape of the work. Build the work correctly and governance is emergent. Build the work badly and no amount of policy recovers it.


This analysis synthesizes Jakob Nielsen, “AI Use in the Real World: AI’s Design Problem Is Organizational, Not Technological” (April 2026), Bick, Blandin & Deming, “Mind the Gap: AI Adoption in Europe and the U.S.” (NBER Working Paper 34995, Spring 2026), Stanford AI Index 2026 — Economy chapter, and MIT NANDA, “The GenAI Divide: State of AI in Business 2025”.

Victorino Group helps leaders redesign workflows so AI becomes usable, trustworthy, and measurable. Let’s talk.

All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com. About The Thinking Wire →
