The AI Control Problem

The Mythical Agent-Month: Why Your AI Strategy Needs Governance, Not More Tokens

Thiago Victorino

Wes McKinney published a piece last week that every CTO running parallel Claude Code sessions should read before their next standup. His argument: Fred Brooks’s 1975 observation — that adding more people to a late project makes it later — has a perfect analog in agentic development. Adding more agent sessions to a complex codebase doesn’t produce linear returns. It produces what McKinney calls an “agentic tar pit.”

He’s articulating something that practitioners have felt for months but lacked the vocabulary to name. And the implications extend far beyond software engineering.

The 1/9th Problem Never Left

Brooks observed that a working program represents roughly one-ninth of the effort required to produce a shippable programming product. The other eight-ninths — testing, documentation, hardening, integration, maintenance infrastructure — are what separate a demo from something customers depend on.

Agents have compressed the first ninth to near-zero. A Claude Code session can produce a functional prototype in hours that would have taken weeks. But McKinney’s point is precise: “Many newly-minted AI vibe coders clearly underestimate the work involved with going from prototype to production.”

The eight-ninths hasn’t shrunk. If anything, it’s grown. When the prototype phase is instant, organizations are tempted to skip directly to deployment. The governance layer — review, testing, monitoring — gets treated as optional overhead rather than the majority of the actual work.

This is the oldest mistake in software, now running at machine speed.

The Brownfield Barrier

McKinney identifies something that will resonate with anyone who’s tried agents on real production code: a performance cliff around 100–200 thousand lines of code.

Below that threshold, agents work well. They can hold enough context to make coherent changes. Above it, he reports that “every new change has to hack through the code jungle created by prior agents.” At Posit, where he’s CEO, agents “struggle much more in 1 million-plus line codebases such as Positron, a VSCode fork.”

The mechanism matters more than the threshold. Agents generating code create what McKinney calls “technical debt on an unprecedented scale, accrued at machine speed.” Each generated function carries a future contextual burden — maintenance, debugging, and reasoning overhead for every subsequent agent session. The agents are optimizing locally while degrading the global environment they operate in.

This is Conway’s Law meeting agentic development: when your “team” consists of agents with no persistent memory and no shared system understanding, the architecture reflects that absence. Fragmented. Inconsistent. Locally correct but globally incoherent.

Essential Complexity Doesn’t Care About Your Token Budget

Brooks distinguished between accidental complexity — the friction imposed by tools, languages, and platforms — and essential complexity — the irreducible difficulty inherent in the problem itself.

Agents are extraordinary at accidental complexity. Boilerplate, API wrappers, CRUD scaffolding, test templates — tasks where the solution pattern is well-established and the work is primarily translation. This is genuinely valuable. Years of developer frustration with accidental complexity are being resolved in seconds.

But essential complexity — the architectural decisions with no precedent to pattern-match against, the trade-offs that depend on business context agents don’t have — remains stubbornly resistant. And here’s McKinney’s critical inversion: agents are so effective at eliminating accidental complexity that they generate new accidental complexity in the process. Defensive boilerplate, overwrought abstractions, unnecessary layers of indirection. The essential structure gets buried under machine-generated scaffolding.

When generating code is free, knowing when to say no is your last defense.

Remove the Difficulty, Remove the Product

Sidu Ponnappa extends this logic to its economic conclusion. His thesis: software value has always derived from difficulty of creation. When AI makes building trivial, the economic category of “product” itself collapses.

The framework is sharp. He distinguishes between assets (expensive and risky to build, therefore scarce, therefore amortizable across customers) and inventory (cheap and fast to produce, therefore abundant, therefore not amortizable because customers can self-produce).

An HRMS that took six months and half a million dollars was an asset. The same HRMS built in a weekend for a few hundred dollars in compute is inventory. The “built-in-a-weekend flex,” as Ponnappa puts it, “is also the confession.”
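The amortization logic behind the asset/inventory distinction can be made concrete with the article's own figures. A minimal sketch (the numbers are illustrative, taken from the example above, not market data):

```python
# Illustrative arithmetic for the asset-vs-inventory distinction,
# using the article's example figures (assumptions, not market data).
build_cost_asset = 500_000      # six-month bespoke HRMS
build_cost_inventory = 300      # weekend build, compute only

def per_customer_cost(build_cost: float, customers: int) -> float:
    """An asset's build cost amortizes across its customer base."""
    return build_cost / customers

# With 100 customers, the asset's unit economics work:
print(per_customer_cost(build_cost_asset, 100))  # 5000.0

# But once any customer can self-produce for ~$300, the vendor's
# sustainable price is capped near the cost of self-production,
# regardless of what the original build cost:
price_ceiling = build_cost_inventory
print(price_ceiling)  # 300
```

The point of the sketch: amortization only works when customers cannot cheaply rebuild the artifact themselves. When they can, the denominator stops mattering.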

What falls below this line keeps expanding. Boilerplate and autocomplete fell first. Then integrations and dashboards. Now full CRUD applications, portals, internal tools. The line moves in only one direction.

What remains above it? Compilers. State-of-the-art models. Systems encoding genuinely novel algorithms or accumulated domain data — trading systems, tax engines, compliance frameworks. Difficulty here lives in understanding, not code volume.

This should alarm any organization whose AI strategy amounts to “generate more code faster.” Speed of generation is precisely the wrong metric when the value of generated artifacts is converging toward zero.

The Exoskeleton Alternative

If autonomous agents fail on essential complexity and the artifacts they produce are losing product value, what’s the governed alternative?

Kasava’s framing is the most actionable I’ve seen: stop thinking of AI as a coworker and start treating it as an exoskeleton.

Physical exoskeletons don’t lift boxes independently. They let humans lift more, longer, with less injury. Ford deployed them in factories. The military uses them for load-carrying. Rehabilitation medicine uses them for mobility. The pattern is identical: the human directs, the technology multiplies capacity.

The parallel to software is precise. Autonomous agents lack what Kasava calls “implicit organizational context” — the unwritten competitive rationale behind pricing decisions, the deprecation that was agreed on last quarter, which customer segments actually generate margin. These aren’t things you can put in a prompt. They’re distributed organizational knowledge that takes months to acquire and years to develop judgment around.

When autonomous agents fail, you often can’t diagnose where in the pipeline things went wrong. The exoskeleton model — micro-agents with visible seams — solves this. Four principles:

Decompose roles into discrete tasks. The question is not "can AI do a developer's job?" but which of the forty-seven weekly subtasks benefit from amplification.

Build focused, reliable micro-agents. One thing done consistently beats broad autonomy done poorly.

Keep humans in the decision loop. AI amplifies execution. Humans retain judgment.

Make the seams visible. Clear inputs and outputs per component enable debugging and maintain trust. When something breaks, you know exactly which component failed and why.

This is governance architecture, not AI architecture. The technology decisions follow the organizational decisions, not the reverse.
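What "visible seams" means in practice can be sketched in a few lines of code. The sketch below is my own illustration of the four principles, not Kasava's implementation: each micro-agent is a named component with an explicit input and output, a human approval gate sits between steps, and any failure is attributed to exactly one component. The agent functions are toy stand-ins for real model calls.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class StepResult:
    component: str   # which micro-agent produced this result
    ok: bool
    output: str
    detail: str = ""

def run_pipeline(task: str,
                 steps: list[tuple[str, Callable[[str], str]]],
                 approve: Callable[[str, str], bool]) -> list[StepResult]:
    """Run micro-agents in sequence. A human approval gate sits between
    each step's execution and the acceptance of its output, and every
    result is tagged with the component that produced it."""
    results: list[StepResult] = []
    current = task
    for name, agent in steps:
        try:
            out = agent(current)
        except Exception as exc:
            # The seam tells us exactly which component broke, and why.
            results.append(StepResult(name, False, "", f"failed: {exc}"))
            return results
        if not approve(name, out):
            # Humans retain judgment: rejection halts the pipeline here.
            results.append(StepResult(name, False, out, "rejected by reviewer"))
            return results
        results.append(StepResult(name, True, out))
        current = out  # explicit handoff: this step's output is the next input
    return results

# Toy micro-agents standing in for real model calls (assumptions, not real APIs).
steps = [
    ("spec-writer", lambda t: f"SPEC({t})"),
    ("implementer", lambda s: f"CODE({s})"),
    ("test-writer", lambda c: f"TESTS({c})"),
]

results = run_pipeline("add rate limiting", steps,
                       approve=lambda name, out: True)  # auto-approve for demo
for r in results:
    print(r.component, r.ok)
```

The design choice worth noting: the pipeline returns an audit trail rather than a bare result. When something goes wrong, you don't get "the AI failed," you get which component failed, on what input, and whether it was an execution error or a human rejection.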

Why CEOs’ AI Enthusiasm Isn’t Landing

The disconnect between these realities and executive messaging has become measurable. Axios reported in February 2026 that mentions of “agentic AI,” “AI workforce,” and “digital labor” on corporate earnings calls increased 4,425% from Q4 2023 to Q4 2025. Meanwhile, an NBER study of 6,000 executives found that nearly 90% of firms reported AI had no impact on employment or productivity over the last three years.

Read that again. A 4,425% increase in talking about AI impact. A 90% rate of zero actual impact.

MIT’s research adds another layer: 95% of organizations showed no measurable return on investment from generative AI. ManpowerGroup’s survey of 14,000 workers across 19 countries found that while AI tool usage increased 13%, worker confidence in AI’s utility dropped 18%.

The gap isn’t a communication problem. It’s a substance problem. CEOs are messaging an AI revolution to shareholders while employees experience an AI that makes them correct 37% of its output — per Workday’s own survey. When the investor deck says “efficiency” and the town hall says “empowerment,” employees hear doublespeak. They’re not wrong.

Ethan McCarty, CEO of Integral, put it plainly: “The gap between AI messaging to shareholders and employees isn’t a communications problem — it’s a trust problem.”

Companies with AI people strategies outperformed those using AI solely as a tech strategy by 11.8%, according to Burson’s reputation study. The variable isn’t which model you use. It’s whether you’ve built the human systems — training, governance, accountability, clear workflow integration — that make the technology trustworthy at the operational level.

The Mythical Agent-Month Is Here

Brooks’s insight in 1975 was that software development is fundamentally a communication and coordination problem, not a typing problem. Adding more typists doesn’t help when the bottleneck is understanding.

Fifty years later, the insight holds. Agents have made typing literally free. And the bottleneck is still understanding. Understanding when to say no. Understanding which complexity is essential. Understanding that a prototype is one-ninth of a product. Understanding that speed of generation is meaningless without governance of output.

The mythical agent-month is here. And it looks exactly like the mythical man-month: the belief that parallelism solves problems that are fundamentally sequential, that velocity substitutes for judgment, that more is better when the constraint was never quantity.

Organizations that recognize this will build the governance layer — the micro-agent architectures with visible seams, the human decision loops, the spec-first workflows. They won’t run fewer agents. They’ll run governed agents.

Organizations that don’t will accumulate technical debt at machine speed, generate artifacts of declining economic value, and wonder why their employees don’t share their enthusiasm.

The agents work. The question was never whether they could generate code. The question is whether your organization can govern what they generate.


Sources

  • Wes McKinney. “The Mythical Agent-Month.” wesmckinney.com, February 2026.
  • Sidu Ponnappa. “After AI, There Is No Product.” sidu.in, February 2026.
  • Kasava. “Stop Thinking of AI as a Coworker. It’s an Exoskeleton.” kasava.dev, February 2026.
  • Eleanor Hawkins. “Why CEOs’ AI Hype Isn’t Landing with Employees.” Axios, February 5, 2026.
  • NBER. Survey of 6,000 executives on AI impact. Via Fortune/Yahoo Finance, February 2026.
  • MIT. “The GenAI Divide: State of AI in Business 2025.” MLQ/MIT, 2025.
  • ManpowerGroup. “2026 Global Talent Barometer.” January 2026.
  • Workday. “Beyond Productivity: AI Value.” 2026.
  • Burson. AI reputation and shareholder returns study. January 2026.
  • Fred Brooks. The Mythical Man-Month. Addison-Wesley, 1975.
  • Fred Brooks. “No Silver Bullet.” IEEE Computer, 1987.

At Victorino Group, we help organizations build the governance layer that turns AI capability into reliable engineering outcomes. If your agents are generating faster than your team can govern, let’s talk.
