Thoughtworks Discovered Governance-as-Code. We Have Been Building It.
Rahul Garg, a Principal Engineer at Thoughtworks, published the final installment of a four-part series on Martin Fowler’s site last week. The piece is called “Encoding Team Standards.” The thesis: stop treating AI instructions as personal productivity hacks. Put them in version control. Make them team infrastructure.
He is right. And he is late.
What Garg Gets Right
The core observation is sound. Two developers on the same team, using the same AI tool, produce materially different code. Not because the tool is inconsistent but because one developer’s prompts carry fifteen years of accumulated judgment and the other’s do not.
Research confirms the magnitude. A December 2024 study (arXiv 2412.20545) found that LLM output accuracy varies by up to 76 percentage points based on prompt formatting alone. Not prompt content. Formatting. A separate study (arXiv 2508.03678) showed that novice programmers systematically omit critical context that experts include instinctively. The consistency problem is not a people problem. It is a systems problem.
Garg’s solution: extract tacit knowledge from senior engineers through structured interviews, codify it into versioned instruction files, and commit those files to the team repository. Four elements per instruction: role definition, context requirements, categorized standards (critical versus advisory), and output format.
This is a legitimate contribution. It gives teams a repeatable method for turning informal expertise into shared infrastructure.
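To make the shape concrete, here is a minimal sketch of one such instruction as a versioned, machine-checkable artifact. The four fields mirror the elements Garg names; the Python representation, the field names, and the CI-style completeness check are our illustration, not his format.

```python
from dataclasses import dataclass

# Hypothetical representation of one team instruction, mirroring the four
# elements the article names: role definition, context requirements,
# categorized standards (critical vs. advisory), and output format.
@dataclass
class Instruction:
    role: str                        # e.g. "security reviewer for payment services"
    context_requirements: list[str]  # what the model must be given before generating
    critical_standards: list[str]    # violations block the change
    advisory_standards: list[str]    # violations are flagged, not blocked
    output_format: str               # e.g. "findings grouped by severity"

def validate(instr: Instruction) -> list[str]:
    """Return a list of problems; an empty list means the instruction is complete.

    A check like this could run in CI so an instruction file cannot be merged
    with missing elements -- the same review gate as any other code.
    """
    problems = []
    if not instr.role.strip():
        problems.append("role definition is empty")
    if not instr.context_requirements:
        problems.append("no context requirements listed")
    if not (instr.critical_standards or instr.advisory_standards):
        problems.append("no standards of either severity")
    if not instr.output_format.strip():
        problems.append("output format not specified")
    return problems

if __name__ == "__main__":
    security_review = Instruction(
        role="security reviewer for payment services",
        context_requirements=["target service name", "data classification of inputs"],
        critical_standards=["no secrets in source", "parameterized SQL only"],
        advisory_standards=["prefer structured logging"],
        output_format="findings grouped by severity, one remediation per finding",
    )
    print(validate(security_review) or "instruction is complete")
```

The point of the sketch is not the schema. It is that once standards live in a file, they can be reviewed, versioned, and checked like anything else in the repository.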
What Garg Misses
The problems start at the boundary of his framing.
No evidence, only assertion. The entire piece is anecdotal. No before/after metrics. No defect-rate comparisons. No productivity measurements. For a publication in the Fowler ecosystem, which built its reputation on evidence-informed practice, this is a notable absence. Garg asks teams to invest significant effort (structured interviews with every senior engineer, iterative codification, PR-based review of instruction files) without showing that the investment pays off.
No feedback loop. The pattern flows in one direction: extract knowledge, encode it, apply it. What happens when encoded instructions produce bad outputs? How does the team detect that a standard, once committed, has become counterproductive? Garg’s model has no mechanism for this. It assumes encoded knowledge stays correct. Anyone who has maintained a style guide for more than six months knows that assumption fails.
No organizational politics. Encoding team standards means deciding whose standards get encoded. When two senior engineers disagree on security severity thresholds (Garg actually surfaces this scenario), the article treats it as a happy accident: “hidden disagreements made visible.” It does not address who resolves those disagreements or how. Standard-setting is a political act. Pretending otherwise does not make the politics disappear.
No multi-tool reality. Teams use Claude Code, Cursor, Copilot, and others simultaneously. Each tool has different instruction formats, different context windows, different behavior. A versioned instruction file that works perfectly in one tool may produce different results in another. The article treats “AI instructions” as a single category. Production teams know better.
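One mitigation teams reach for, sketched below under stated assumptions: keep a single canonical standards file and generate the per-tool copies from it, so divergence across tools becomes a build step rather than a manual chore. The target paths follow conventions these tools have commonly used (CLAUDE.md for Claude Code, .cursorrules for Cursor, .github/copilot-instructions.md for Copilot); verify them against current tool documentation. The script itself is our illustration, not something the article proposes.

```python
from pathlib import Path

# One canonical source of team standards, rendered into the per-tool files
# each assistant actually reads. Target paths are conventions at the time of
# writing; check each tool's documentation before relying on them.
CANONICAL = Path("standards/team-standards.md")

TARGETS = {
    "claude_code": Path("CLAUDE.md"),
    "cursor": Path(".cursorrules"),
    "copilot": Path(".github/copilot-instructions.md"),
}

HEADER = "<!-- Generated from {src}. Edit the source, not this file. -->\n\n"

def render_all() -> None:
    """Copy the canonical standards into every tool-specific location.

    A richer version might trim content per tool (smaller context windows)
    or adapt format quirks; the point is that per-tool drift is detectable
    and reproducible, not accidental.
    """
    if not CANONICAL.exists():
        raise SystemExit(f"missing canonical standards file: {CANONICAL}")
    body = CANONICAL.read_text(encoding="utf-8")
    for tool, target in TARGETS.items():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(HEADER.format(src=CANONICAL) + body, encoding="utf-8")
        print(f"wrote {target} for {tool}")

if __name__ == "__main__":
    render_all()
```

Run it from a pre-commit hook or CI step and the generated copies cannot silently drift from the canonical source.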
Developer Tooling or Organizational Governance?
Here is where the framing matters.
Garg’s article lives in the developer tooling category. His audience is engineering teams. His examples are code generation, refactoring, security review. His solution is files in a git repository.
We have been working on the same problem from a different altitude. As we explored in CLAUDE.md: The Instruction Manual for Your Code Assistant, versioned instruction files are not new. We wrote the practical implementation guide months ago. The question that matters now is not “should you version your AI instructions?” (yes, obviously) but “what kind of organizational asset are you building when you do?”
A prompt on a developer’s machine is a productivity hack. A prompt in a team repository is a standard. A standard that applies across teams, tools, and business functions is governance infrastructure.
Garg stops at the second step. We are building the third.
The Pattern Extends Beyond Engineering
The evidence that this is a governance problem, not a tooling problem, comes from watching the same pattern emerge outside engineering.
Marketing teams face the identical consistency challenge. As we documented in Your Style Guide Is a Governance Layer, brand voice instructions for AI tools serve the same function as Garg’s team standards: codified judgment, versioned, applied at generation time. The structural pattern is identical. The domain is completely different.
The same applies to legal review, design systems, sales enablement. Every function that uses AI to generate output needs versioned standards that encode organizational judgment. Calling this “developer tooling” constrains the solution to one department when the problem spans the entire organization.
Generation-Time Is the Right Insight, but an Incomplete One
Garg makes one claim worth amplifying: catching misalignment at generation time is higher leverage than catching it at review time. If the AI produces code that already conforms to your standards, the review burden drops.
This is correct. And as we argued in You Are Not Killing Code Review. You Are Renaming Governance, generation-time standards do not replace review-time governance. They add a layer. The Swiss cheese model applies: every enforcement layer has holes. Generation-time instructions reduce defects. They do not eliminate them. Teams that treat instruction files as sufficient (rather than as one layer in a multi-layer system) will learn this the hard way.
The real frontier is not “encode standards into AI instructions.” That is table stakes. The frontier is building systems where standards are enforced at generation, verified at review, monitored in production, and updated based on what monitoring reveals. Garg covers one layer of four.
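A rough way to picture the gap, as a sketch: the layer names below follow the argument above, while the mechanisms, owners, and feedback targets are hypothetical placeholders, not a prescribed design.

```python
from dataclasses import dataclass

# Illustrative model of the four enforcement layers described above.
@dataclass
class Layer:
    name: str
    mechanism: str
    owner: str
    feeds_back_into: str

LAYERS = [
    Layer("generation", "versioned instruction files applied at prompt time",
          "team leads", "review"),
    Layer("review", "PR checks that verify output against the same standards",
          "code owners", "monitoring"),
    Layer("monitoring", "production signals traced back to the standard that allowed them",
          "platform/ops", "standards update"),
    Layer("standards update", "scheduled revision of instruction files from monitoring data",
          "governance owners", "generation"),
]

def coverage_report(implemented: set[str]) -> None:
    """Show which layers a team actually has in place.

    An instruction file on its own covers only the first entry.
    """
    for layer in LAYERS:
        status = "covered" if layer.name in implemented else "MISSING"
        print(f"{layer.name:17} {status:8} -> feeds {layer.feeds_back_into}")

if __name__ == "__main__":
    coverage_report({"generation"})  # Garg's pattern, taken alone
```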
What Thoughtworks Calls “Encoding,” We Call Governance-as-Code
Garg positions this pattern as novel. The Thoughtworks Looking Glass report for 2026 uses the term “computational governance.” But the underlying principle has existed in engineering for decades. Linting rules, CI pipelines, pre-commit hooks: these are all encoded team standards. The extension to AI instruction files is valid. It is not revolutionary.
What is new, and what the Thoughtworks framing does not capture, is the organizational scope. When you encode standards for AI agents that operate across functions, you are not configuring a tool. You are building a governance layer. One that needs ownership, review cycles, deprecation policies, and cross-functional alignment.
This is what we have been building and documenting. Not instruction files for individual tools but governance infrastructure that treats AI consistency as a system property. The difference matters because system properties survive personnel changes, tool migrations, and organizational restructuring. Personal productivity hacks do not.
Where This Goes Next
Garg’s series promises a fifth pattern called “Feedback Flywheel.” If it addresses the missing feedback loop, the series will be stronger for it. But the fundamental limitation will remain: framing governance as developer tooling limits who builds it, who owns it, and how far it reaches.
The organizations that get this right will not have better prompt files. They will have governance infrastructure that makes AI output consistent across teams, tools, and business functions. Not because individual engineers wrote good instructions, but because the organization invested in standards as a system.
One approach scales with headcount. The other scales with architecture.
This analysis synthesizes Encoding Team Standards (March 2026), The Impact of Prompt Programming on Function-Level Code Generation (December 2024), and Thoughtworks Looking Glass 2026 (2026).
Victorino Group helps organizations build governance infrastructure that makes AI consistency a system property, not a staffing dependency. Let’s talk.