Only 17% of Design Systems Document AI, and One Ships a Governance Gate

Thiago Victorino

We have argued on this site that the design system is the constraint layer for AI-generated interfaces. When a model produces a button, a form, a whole screen, the system is what decides whether that output is shippable. That claim was a bet. Now there is a number to measure it against.

Romina Kavcic, who runs The Design System Guide, published a first-party census on June 1, 2026. She read through 156 public design systems and asked one question: does this system say anything at all about AI? The answer for 130 of them was no. Only 26 document AI in any form. Roughly 17%. This is her own dataset and her own analysis, not an industry-audited figure, so treat the precision as directional. The shape of the finding still holds: the constraint layer we keep pointing to mostly does not exist yet.
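
The headline percentages follow directly from the two counts quoted above. A quick check of the arithmetic, using only those figures (the variable names are ours, not Kavcic's):

```python
# Counts as quoted from the census in the paragraph above.
total_systems = 156
documenting_ai = 26

silent = total_systems - documenting_ai
share_documenting = documenting_ai / total_systems
share_silent = silent / total_systems

print(silent)                      # 130
print(f"{share_documenting:.1%}")  # 16.7%
print(f"{share_silent:.1%}")       # 83.3%
```

Rounded to whole numbers, that gives the 17% and 83% used throughout this piece.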

The Census Is the Story

Most commentary on AI and design treats governance as a settled debate that organizations are simply slow to adopt. The census says something harder. The discipline has not written the rules down. Eighty-three percent of the public design systems in the census, including many that teams copy and cite, offer no guidance on how AI fits into their components, their tokens, or their review process.

That silence is not neutral. A design system is documentation that gets enforced through linting, CI gates, and review. When the documentation is empty on AI, every team building AI features into a product is improvising against a system that has no opinion. The improvisation is invisible until it ships something the brand cannot defend.

So the interesting work is not in the 130 systems that say nothing. It is in the 26 that tried, and in what they happened to agree on without coordinating.

Five Levels of AI Readiness

Reading across the 26, a maturity ladder emerges. Kavcic’s framing maps onto five levels above a baseline of no guidance, and the distribution is lopsided toward the bottom.

L0, no guidance. The 130 systems with nothing. The default state of the field.

L1, decoration. AI shows up as a visual treatment. A shimmer, a gradient, an icon that signals “this came from a model.” Shopify sits here. The system knows AI exists and gives it a look, but says nothing about behavior.

L2, component. AI-specific components with documented states: loading, streaming, error, empty. IBM Carbon reaches this level. The system now treats AI output as a thing with a lifecycle, not just a style.

L3, interaction pattern. Documented patterns for how a user and an AI feature exchange control. Atlassian works at this level, specifying how suggestions surface, how a user accepts or rejects, how the system recovers.

L4, governance layer with enforcement. The system does not just describe good AI behavior, it gates on it. Microsoft Fluent 2 sits here, and in this census it is alone.

L5, system infrastructure. AI woven into the foundations of the system itself. Microsoft Fluent shows the only strong public signal at this level, and even that is partial.

The ladder matters because it separates intent from teeth. Levels 1 through 3 are description. A team can read them, agree with them, and ignore them under deadline, and nothing stops the bad screen from shipping. Only at level 4 does the documentation start refusing output.
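
One way to make the ladder concrete is to encode it as an ordered type, so that "has teeth" becomes a comparison rather than a judgment call. This is a minimal sketch: the enum member names and the `has_teeth` helper are our own labels, and the placements are only the ones stated in this article.

```python
from enum import IntEnum


class AIReadiness(IntEnum):
    """Maturity levels as summarized above (member names are ours)."""
    NO_GUIDANCE = 0
    DECORATION = 1
    COMPONENT = 2
    INTERACTION_PATTERN = 3
    GOVERNANCE_GATE = 4
    SYSTEM_INFRASTRUCTURE = 5


def has_teeth(level: AIReadiness) -> bool:
    """True once the system can refuse output, not only describe it."""
    return level >= AIReadiness.GOVERNANCE_GATE


# Placements as stated in the text above.
examples = {
    "Shopify": AIReadiness.DECORATION,
    "IBM Carbon": AIReadiness.COMPONENT,
    "Atlassian": AIReadiness.INTERACTION_PATTERN,
    "Microsoft Fluent 2": AIReadiness.GOVERNANCE_GATE,
}

enforcing = [name for name, level in examples.items() if has_teeth(level)]
print(enforcing)  # ['Microsoft Fluent 2']
```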

What the Leaders Converged On

Here is the part that should change how you read the field. The systems that took AI seriously, working independently, landed on the same four principles. Convergence like this is a signal that the principles are load-bearing, not stylistic.

Mark AI content. The interface should make it visible when a model produced or shaped what the user is seeing. Not as decoration, as disclosure.

Explain decisions in layers. A What/Why/How structure, scaled to the risk of the action. A low-stakes suggestion needs a light touch. A consequential decision needs the reasoning exposed. The depth of explanation tracks the cost of being wrong.

Maintain human control. The user stays in the loop and keeps the ability to override. GitLab states the intent plainly in its guidance: design AI to be collaborative, not autonomous. AI should suggest and assist while users remain in control.

Design for failure states. Assume the model will be wrong, slow, or empty, and design those moments deliberately. The error state is not an edge case to bolt on. It is a first-class screen.

Four primitives, arrived at separately. Our earlier posts asserted that a benchmark like this existed in spirit. Now it has names, and the names came from the practitioners, not from us.
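
Of the four, layered explanation is the easiest to sketch mechanically. The example below assumes three risk tiers and a simple cumulative mapping onto What/Why/How. Both the tiers and the mapping are our illustration, not something any of the surveyed systems prescribes.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Assumed mapping: which explanation layers to surface at each risk tier.
LAYERS_BY_RISK = {
    Risk.LOW: ("what",),
    Risk.MEDIUM: ("what", "why"),
    Risk.HIGH: ("what", "why", "how"),
}


def explanation_layers(risk: Risk) -> tuple[str, ...]:
    """Return the layers to show; depth grows with the cost of being wrong."""
    return LAYERS_BY_RISK[risk]


print(explanation_layers(Risk.LOW))   # ('what',)
print(explanation_layers(Risk.HIGH))  # ('what', 'why', 'how')
```

A real system would likely attach the risk tier to the action itself, so a component cannot render a high-risk suggestion without the deeper layers.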

The One System With a Gate

Convergence on principles is cheap. Enforcement is rare, and exactly one system in the census ships it.

Microsoft Fluent 2 includes a Responsible AI rubric that issues letter grades and automatic fails. A design that violates a non-negotiable does not get a low score with a note to improve. It fails, and failing is a shippability condition. This is governance turned into product surface: the rubric is not a PDF the team is supposed to remember, it is a gate the work has to pass.
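
To show what "automatic fail" means structurally, here is a minimal sketch of a rubric where any failed non-negotiable overrides the score. It is not Fluent 2's rubric: the criteria, the grade thresholds, and the names `Criterion`, `grade`, and `shippable` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    passed: bool
    non_negotiable: bool = False


def grade(criteria: list[Criterion]) -> str:
    """Letter grade by pass rate; any failed non-negotiable is an automatic F."""
    if not criteria:
        raise ValueError("rubric needs at least one criterion")
    if any(c.non_negotiable and not c.passed for c in criteria):
        return "F"  # automatic fail, regardless of the other criteria
    rate = sum(c.passed for c in criteria) / len(criteria)
    # Thresholds below are illustrative assumptions.
    if rate >= 0.9:
        return "A"
    if rate >= 0.75:
        return "B"
    if rate >= 0.6:
        return "C"
    return "D"


def shippable(criteria: list[Criterion]) -> bool:
    """The gate: a failing grade blocks the work."""
    return grade(criteria) != "F"


unmarked = [
    Criterion("AI content is marked", passed=False, non_negotiable=True),
    Criterion("Explanation depth matches risk", passed=True),
    Criterion("User can override", passed=True, non_negotiable=True),
    Criterion("Failure states are designed", passed=True),
]
thin_explanation = [
    Criterion("AI content is marked", passed=True, non_negotiable=True),
    Criterion("Explanation depth matches risk", passed=False),
    Criterion("User can override", passed=True, non_negotiable=True),
    Criterion("Failure states are designed", passed=True),
]

print(grade(unmarked), shippable(unmarked))                  # F False
print(grade(thin_explanation), shippable(thin_explanation))  # B True
```

Both designs miss exactly one criterion. Only the one that misses a non-negotiable is blocked, which is the difference between a score and a gate.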

We have called this pattern the difference between a design system that documents good behavior and one that enforces it. Fluent 2 is the public proof that the enforcement end of the ladder is buildable today, not someday. It also shows how lonely that end is. One system, out of 156, treats AI governance as something the work must clear rather than something the team should consider.

The Frontier Is Open

The honest read of this census is not that design governance for AI is behind. It is that design governance for AI is barely started. Seventeen percent participation, a handful of systems above the description line, and a single example of real enforcement. That is the profile of an open frontier, not a maturing practice.

For most organizations this is good news, because it means the work has not been claimed. The leaders have already done the hard part of naming the primitives. Mark content, explain in layers, keep humans in control, design for failure. The primitives are public. What is missing almost everywhere is the gate that turns them from advice into a condition of shipping.

Do This Now

Pull your own design system documentation and run Kavcic’s question against it: does it say anything about AI at all? If the answer is no, you are in the 83%, and your AI features are shipping against a constraint layer that has no opinion. Start at L2. Document the states of AI components, the interaction patterns, and the failure modes, using the four convergent principles as your checklist. Then pick the one or two violations that must never ship, and wire them into review as automatic fails, the way Fluent 2 does. The leaders gave you the rubric. The only question left is whether your system has teeth.
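
A first pass at the census question can be automated with a keyword scan over your documentation pages. This is a rough sketch under stated assumptions: the term list and the page names are ours, and a keyword hit only tells you where to start reading, not whether the guidance is any good.

```python
import re

# Assumed term list; extend it with whatever your organization calls its AI features.
AI_TERMS = re.compile(
    r"\b(ai|artificial intelligence|llm|generative|machine learning)\b",
    re.IGNORECASE,
)


def mentions_ai(doc_text: str) -> bool:
    """True if the text contains any AI-related term as a whole word."""
    return AI_TERMS.search(doc_text) is not None


def pages_mentioning_ai(pages: dict[str, str]) -> list[str]:
    """Return the names of pages that mention AI, sorted for stable output."""
    return sorted(name for name, text in pages.items() if mentions_ai(text))


# Hypothetical documentation pages.
docs = {
    "button.md": "Buttons use the primary color token. Maintain contrast.",
    "suggestion.md": "Mark AI-generated content with a visible label.",
}

print(pages_mentioning_ai(docs))  # ['suggestion.md']
```

An empty result puts you at L0 on the ladder above.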


This analysis synthesizes Design Systems That Document AI (The Design System Guide, Romina Kavcic, June 2026).

All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.