Who Actually Decides When You Build With a Coding Agent
Humans make roughly 70% of the planning decisions in a coding-agent session, and the agent makes roughly 80% of the execution decisions. That split comes from Anthropic’s analysis of about 400,000 Claude Code sessions across 235,000 users and 23 occupational groups, run between October 2025 and April 2026. It is the first large-N, first-party measurement of who decides what when a person builds software with an agent. The headline is not that the agent writes the code. Everyone knew that. The headline is that the decision boundary is sharp, stable, and falls exactly where accountability should: the human owns the what, the agent owns the how.
We have argued before that domain expertise is the control variable nobody measures and that the verification layer is what separates output from competence. Those were claims. Anthropic just put numbers behind them at a scale no single team could reproduce.
The 70/30 Split Is a Job Description
“People decide what to build, and the agent decides how to build it.” That sentence from Anthropic Economic Research reads like a slogan until you see it backed by 400,000 sessions. Planning decisions: scope, sequencing, what problem to solve, what tradeoff to accept, when the work is done. Humans drive about 70% of them. Execution decisions: which function signature, which library call, how to structure the loop, which edge case to handle inline. The agent drives about 80% of those.
Read that as an org chart. The human is the director of execution, not its passenger. The agent is a fast, capable engineer who needs a clear brief and an accountable reviewer. Neither role is optional. A session where the human abdicates planning produces well-built software solving the wrong problem. A session where the human micromanages execution wastes the one thing the agent is actually good at.
The most useful reframe here is operational. If you are staffing a team to build with agents, you are not hiring for typing speed. You are hiring for the 70%: people who can decide what to build and recognize when it is built correctly. That is a different interview, a different scorecard, and a different definition of seniority than most engineering orgs run today.
Expertise Predicts Success, Coding Skill Does Not
The finding that should rearrange hiring plans: every major occupation lands within 7 percentage points of software engineers on success rate. A financial analyst, a biologist, a marketer, or a lawyer building a tool in Claude Code succeeds at a rate close to that of a trained software engineer doing the same. The variable that moves the number is not whether you can code. It is whether you understand the problem.
Anthropic states it directly: “Success is determined by how well a person understands the problem they are trying to solve, not whether they’re trained in coding.” The data underneath has texture. Verified success runs at 15% for novices and 28% to 33% for intermediates and experts. Partial success climbs from 77% for novices to 91% or 92% for experts. The expert advantage is real and it compounds, but it tracks domain command, not syntax fluency.
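To make the size of that gap concrete, here is a minimal arithmetic sketch using only the percentages quoted above. The variable names are illustrative, and the "experienced" label simply groups the intermediate and expert range as the text reports it.

```python
# Verified-success rates quoted above, in percent.
novice_verified = 15
experienced_verified_low, experienced_verified_high = 28, 33

# How many times the novice rate the experienced range represents.
ratio_low = experienced_verified_low / novice_verified
ratio_high = experienced_verified_high / novice_verified

# Partial-success rates quoted above, in percent.
novice_partial = 77
expert_partial_low, expert_partial_high = 91, 92

# Gap in percentage points between experts and novices.
gap_low = expert_partial_low - novice_partial
gap_high = expert_partial_high - novice_partial

print(f"verified success: {ratio_low:.1f}x to {ratio_high:.1f}x the novice rate")
print(f"partial success: +{gap_low} to +{gap_high} percentage points")
```

On these figures, verified success for intermediates and experts is roughly double the novice rate, while the partial-success gap is about 14 to 15 percentage points.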
This is why the expertise-tax argument we made about spatial biology generalizes. The tax is the same in every field. A non-coder with deep domain knowledge clears the bar. A strong coder working outside their domain does not get a free pass. The agent collapses the cost of writing code toward zero, which exposes the cost that was always the real bottleneck: knowing what correct looks like.
The Behavior Changes With Mastery
Novice sessions average 5 Claude actions per prompt. Expert sessions average 12 or more. The expert is not being more cautious. The expert is handing the agent larger, better-specified units of work and trusting it to run further before checking in. That is delegation calibrated by judgment, and it only comes from knowing the problem well enough to write a brief the agent can execute without a leash.
The trend lines over the study window tell the rest. Debugging sessions fell from 33% to 19%. Task value rose about 25% over seven months. Users moved up the value chain: less time fixing what the agent broke, more time directing it at work that mattered. That shift did not come from the model alone. It came from users learning where the 70/30 line sits and no longer fighting it.
What This Does Not Prove
Anthropic is candid about the limits, and we should be too. The classifiers that produced these numbers cannot validate real-world outcomes, and the team cannot verify classifier accuracy at full scale. The 70/30 split is measured decision attribution, not proven causation. “Expertise predicts success” is a strong correlation in a very large sample, not a controlled trial. Treat the numbers as the best map we have of agent-assisted work, drawn from the largest first-party dataset published to date, and treat the direction as more trustworthy than any single decimal point.
That caveat does not soften the strategic read. Even discounted for classifier noise, a finding that holds across 23 occupations and 235,000 users is not an artifact. The boundary is there.
Do This Now
Audit one team’s agent work against the 70/30 line. Pick a recent project built with a coding agent and ask two questions. First: who made the planning calls on scope, sequencing, and definition of done? If the answer is “the agent decided and we accepted it,” your accountability is inverted and your blast radius is larger than you think. Second: did the people directing the agent have real command of the problem domain, or just command of the tooling? If you staffed on tooling fluency, you optimized for the 80% of execution the agent already owns and underbought the 70% of planning that decides whether the output is correct.
Then change one thing in your next hire or staffing decision. Weight domain command over coding pedigree for the director role, and make someone explicitly accountable for the planning 70%. The agent will handle the execution 80%. Your job is to make sure a human who understands the problem is holding the other end.
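One way to run the first audit question is to hand-label the decisions from a project retrospective and tally who made them. The sketch below is a minimal illustration under stated assumptions: the `Decision` record, the `human_share` and `audit` helpers, the sample log, and the `planning_floor` threshold are all hypothetical and do not come from Anthropic's study or any existing tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    kind: str     # "planning" or "execution" (label assigned by the reviewer)
    decider: str  # "human" or "agent"


def human_share(decisions):
    """Map each decision kind to the fraction of its decisions made by a human."""
    totals = Counter(d.kind for d in decisions)
    by_human = Counter(d.kind for d in decisions if d.decider == "human")
    return {kind: by_human[kind] / totals[kind] for kind in totals}


def audit(decisions, planning_floor=0.5):
    """Return warnings for the log.

    planning_floor is an illustrative threshold chosen for this sketch,
    not a figure from the study.
    """
    shares = human_share(decisions)
    warnings = []
    if shares.get("planning", 0.0) < planning_floor:
        warnings.append(
            "planning is mostly agent-decided: accountability may be inverted"
        )
    return warnings


# Hypothetical hand-labeled log from one project retrospective.
log = (
    [Decision("planning", "human")] * 7
    + [Decision("planning", "agent")] * 3
    + [Decision("execution", "human")] * 2
    + [Decision("execution", "agent")] * 8
)

shares = human_share(log)
print(shares)
print(audit(log))
```

The tally only answers who decided. The second question, whether the people directing the agent had real command of the domain, still needs a human judgment that no log can supply.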
This analysis synthesizes Agentic Coding and Persistent Returns to Expertise (Anthropic Economic Research, June 2026).
Victorino Group helps teams turn agent adoption into measured, accountable delivery. Let’s talk.
All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.