The AI Control Problem

The Speed Trap: Why Faster AI Coding Makes Your Delivery Slower

Thiago Victorino
9 min read

Edsger Dijkstra wrote it decades ago: “Lines of code should not be regarded as ‘lines produced’ but as ‘lines spent.’” Bill Gates put it differently: “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”

Both men were talking about the same error. We are making it again, at scale, with AI.

Three independent sources published in March 2026 converge on a single diagnosis. The anonymous author at Antifound. Andrew Murphy. Justin Jackson. None of them cite each other. All three arrive at the same conclusion: optimizing the speed of code generation does not improve software delivery. In many cases, it makes delivery worse.

This is not an abstract concern. It is a systems dynamics problem with a name, a body of theory, and decades of evidence behind it. Eliyahu Goldratt called it the Theory of Constraints. The principle is simple: a system’s throughput is governed by its bottleneck. Optimizing anything that is not the bottleneck does not improve throughput. It increases inventory.

In software delivery, code writing is not the bottleneck.

The 20/80 Split

Murphy frames it with a question that should be familiar to any engineering leader: if a feature takes two months to ship, and the actual coding takes an afternoon, where did the other 59 days go?

The answer is queues. Code review queues. QA queues. Staging environment queues. Compliance approval queues. Deployment window queues. Stakeholder feedback queues. The code sits, waiting, while humans coordinate around it.

Murphy’s estimate is that writing code accounts for roughly 20% of the delivery lifecycle. The remaining 80% is waiting, reviewing, coordinating, and resolving ambiguity. These numbers will vary by organization, but the ratio is consistent with what Goldratt observed in manufacturing decades ago: the constraint is almost never the production step everyone focuses on.

As we documented in Measuring AI in Software Development, Jellyfish found that organizations moving to full AI adoption saw 113% more pull requests per developer. But more PRs do not mean more shipped features. They mean more items entering the queue. If the queue cannot absorb them, the system degrades.

What Happens When You Speed Up the Wrong Step

Goldratt’s theory makes a specific prediction: when you optimize a non-bottleneck, you do not get a faster system. You get a more congested one.

Murphy describes exactly this pattern. Developers armed with AI coding tools produce more code, faster. PRs multiply. But the review process has not accelerated. The QA process has not accelerated. The deployment pipeline has not accelerated. So the work piles up between stations.
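Goldratt's prediction can be checked with a toy model. Below is a minimal sketch, with hypothetical rates, of a two-stage pipeline where generation feeds a review queue: doubling generation speed does not change how much ships, it only grows the queue.

```python
def simulate(gen_per_day: int, review_per_day: int, days: int):
    """Return (total_shipped, final_queue_depth) after `days` days."""
    queue = 0
    shipped = 0
    for _ in range(days):
        queue += gen_per_day                   # new PRs enter the review queue
        reviewed = min(queue, review_per_day)  # reviewers are the constraint
        queue -= reviewed
        shipped += reviewed
    return shipped, queue

# Before AI tools: generation roughly matches review capacity.
print(simulate(gen_per_day=10, review_per_day=10, days=20))  # (200, 0)

# After AI tools: generation doubles, review capacity unchanged.
print(simulate(gen_per_day=20, review_per_day=10, days=20))  # (200, 200)
```

Same 200 PRs shipped in both runs. The only difference is 200 PRs of invisible inventory sitting in review.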

The Antifound author spends 60% of working time on planning, design, and research. Code generation, even with heavy LLM usage, consumes the remaining fraction. And 75% of that LLM-assisted code requires significant editing before it is production-ready. The speed of generation is largely irrelevant when the generation step is already the smallest piece of the timeline.

This is the speed trap. It feels productive. Developers report feeling faster. (The METR study we covered in our measurement analysis found developers perceived a 20% speedup while actually being 19% slower.) But the system is not faster. The system has more work-in-progress, longer queues, and less visibility into what actually matters.

Murphy summarizes it bluntly: “You didn’t speed anything up. You created a traffic jam and called it productivity.”

The Inventory Problem

In manufacturing, inventory between stations is visible. Boxes stack up on the factory floor. Someone notices.

In software, inventory is invisible. It lives in pull request queues, Jira boards, and Slack threads. Nobody walks past a pile of unreviewed PRs and thinks “we have an inventory problem.” They think “the team is busy.”

When AI-assisted developers produce more code, this invisible inventory grows. Murphy identifies three consequences.

First, context switching increases. More PRs in review means more interruptions for reviewers. Each context switch carries a cognitive cost. The reviewer’s throughput drops as the queue grows.

Second, merge conflicts multiply. Code that sits in queue diverges from main. The longer it waits, the more likely it conflicts with other queued work. Resolving conflicts is rework that would not exist if the queue were shorter.

Third, feedback loops lengthen. A developer who submits a PR on Monday and gets review feedback on Thursday has lost three days of context. They have moved on to other work. Returning to the original PR requires reloading context, which is slow and error-prone. As we explored in Cognitive Debt: The Invisible Cost, this cognitive overhead compounds across teams.

All three consequences reduce effective throughput. The system produces more code and delivers less value.
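Little's Law makes the third consequence quantitative: average time in queue equals work-in-progress divided by throughput. A quick sketch with illustrative numbers (the 10-PRs-per-day review capacity is an assumption, not a figure from the sources):

```python
# Little's Law, L = lambda * W, rearranged for a PR review queue:
# average days waiting = work-in-progress / review throughput.

def avg_queue_days(wip: float, reviews_per_day: float) -> float:
    """Average days a PR waits, given queue depth and review throughput."""
    return wip / reviews_per_day

# Review capacity is fixed at 10 PRs/day in both scenarios.
print(avg_queue_days(wip=30, reviews_per_day=10))  # 3.0 days of waiting
print(avg_queue_days(wip=60, reviews_per_day=10))  # 6.0 days of waiting
```

Doubling PR volume against fixed review capacity doubles the wait, and every extra day of waiting is a day of lost context.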

More Code, Less Understanding

The Antifound author raises a concern that cuts deeper than queue theory. When developers generate code with LLMs, they often understand it less than code they wrote manually.

This is not speculation. The author documents spending 75% of LLM interactions on code generation, with heavy editing required afterward. The editing is where understanding develops. But when the volume of generated code exceeds the developer’s capacity to edit carefully, understanding degrades.

Murphy connects this to a systemic risk: “More code, less understanding. That’s not a productivity gain. That’s a time bomb with a nicer dashboard.”

The verification cost does not disappear because the generation was fast. As we covered in Cheap Code, Expensive Quality, generating code dropped to near-free while verifying code remained expensive. Faster generation without faster verification means more unverified code in the system. The AI Verification Debt research shows 96% of developers distrust AI-generated code, with 48% actively verifying each output. That verification work is manual, slow, and does not scale with generation speed.

The Role Confusion Multiplier

Justin Jackson adds another dimension to the speed trap. When AI makes coding accessible to non-engineers, the queue problem compounds.

Product managers write prototypes. Designers ship components. Everyone produces code because the barrier dropped. But the review, integration, and deployment processes were not designed for this volume or this variety of contributors. A PM’s prototype enters the same review queue as an engineer’s feature branch, consuming the same scarce reviewer attention.

As we analyzed in The Mexican Standoff, this role expansion creates coordination overhead that offsets individual productivity gains. Kent Beck’s observation applies here too: when the value of implementation skills drops to near zero, the bottleneck shifts entirely to judgment, coordination, and system design. These are the skills that remain scarce, and flooding the system with more generated code does nothing to expand their supply.

What Goldratt Would Tell Your Engineering Team

If Goldratt were consulting for a software organization adopting AI coding tools, his advice would be predictable, because his framework prescribes the same five steps regardless of industry.

Step 1: Identify the constraint. Map your delivery pipeline from feature request to production deployment. Measure where work waits longest. In most organizations, the constraint is not coding. It is review, testing, or deployment approval.

Step 2: Exploit the constraint. Before adding capacity, ensure the bottleneck is running at maximum efficiency. If code review is the constraint, are reviewers spending 100% of their review time actually reviewing? Or are they in meetings, writing their own code, responding to Slack? Protect the bottleneck’s time.

Step 3: Subordinate everything else to the constraint. This is the counterintuitive step. It means deliberately slowing down non-bottleneck steps to match the bottleneck’s capacity. If your review process can handle 10 PRs per day, producing 25 PRs per day is not productive. It is wasteful. Limit work-in-progress to what the system can absorb.

Step 4: Elevate the constraint. Only after steps 1 through 3, invest in expanding the bottleneck. This might mean AI-assisted code review, automated testing, or streamlined deployment approvals. Apply AI to the constraint, not to the step that was already fast enough.

Step 5: Repeat. Once you relieve one bottleneck, another appears. The system’s throughput is always governed by its current constraint, wherever that happens to be.
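Step 1 is the one most teams skip, and it is mechanical. A sketch of what identifying the constraint looks like in practice, assuming a hypothetical event log where each work item records the day it entered each pipeline stage:

```python
# Hypothetical pipeline stages; real ones come from your tracker's events.
STAGES = ["coded", "review_started", "merged", "deployed"]

def waits_by_stage(items):
    """Sum, per stage transition, the days items spent waiting there."""
    totals = {}
    for item in items:
        for before, after in zip(STAGES, STAGES[1:]):
            key = f"{before} -> {after}"
            totals[key] = totals.get(key, 0) + (item[after] - item[before])
    return totals

# Illustrative data: two work items, timestamps in days since work started.
items = [
    {"coded": 0, "review_started": 5, "merged": 6, "deployed": 8},
    {"coded": 1, "review_started": 7, "merged": 8, "deployed": 9},
]
waits = waits_by_stage(items)
print(max(waits, key=waits.get))  # coded -> review_started: review is the constraint
```

Whichever transition accumulates the most waiting is your constraint, and by Step 4, the place to point AI investment.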

Most organizations adopting AI coding tools have jumped directly to accelerating Step 0 (code writing) without performing Step 1 (identifying where the actual constraint lives). The result is predictable.

The Metric That Matters

If lines of code is the wrong metric, what is the right one?

Cycle time. The elapsed time from “work started” to “value delivered to users.” Not PR creation time. Not coding time. End-to-end cycle time.

An organization where AI coding tools reduce cycle time is genuinely more productive. An organization where AI coding tools increase PR volume while cycle time stays flat (or grows) has fallen into the speed trap.

Murphy’s challenge is direct: measure how long a feature takes from conception to production. If the answer is still two months, your AI coding tools have not improved delivery. They have improved one afternoon within a two-month process.

The Antifound author’s time allocation tells the same story from the individual perspective. When 60% of your time goes to planning and design, accelerating the remaining 40% by even 50% saves you 20% of total time at best. And that assumes the generated code needs no editing, which it does.
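That arithmetic is an Amdahl's-Law-style bound, and it is worth seeing stated plainly (using the article's figures): speeding up one fraction of the timeline can never save more than that fraction's share of the total.

```python
def total_time_saved(step_fraction: float, fraction_of_step_eliminated: float) -> float:
    """Fraction of end-to-end time saved when only one step is accelerated."""
    return step_fraction * fraction_of_step_eliminated

# Coding is 40% of the timeline; generation cuts that step's time in half;
# planning, review, and deployment are untouched.
print(total_time_saved(0.40, 0.50))  # 0.2 -> at best a 20% reduction overall
```

And that 20% is the ceiling. Every hour spent editing the generated code eats into it.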

The Uncomfortable Conclusion

The three sources converge on a conclusion that AI tool vendors will not put in their marketing materials.

Faster code generation, applied to a system where code writing is not the bottleneck, makes delivery worse. Not the same. Worse. Because it increases inventory, lengthens queues, multiplies context switches, and creates the illusion of productivity where none exists.

The organizations that will benefit most from AI coding tools are the ones that first understand their delivery system well enough to know where the constraint actually lives. Then apply AI there.

Dijkstra was right. Lines of code are lines spent, not lines produced. Spending them faster, without understanding what you are spending them on, is not progress. It is acceleration in the wrong direction.


This analysis synthesizes Codegen Is Not Productivity (March 2026), If You Thought Speed Was Your Problem (March 2026), and Will Claude Code Ruin Our Team? (March 2026).

Victorino Group helps enterprises measure what actually matters in AI-assisted development: not lines of code, but delivery outcomes.

If this resonates, let's talk

We help companies implement AI without losing control.

Schedule a Conversation