A Top Agent's Core Is a While-Loop
An April 2026 arXiv preprint dissected the leaked Claude Code source and published a finding that sounds anticlimactic: the core of a top-tier coding agent is a while-loop. Call the model. Run the tools it requested. Feed the results back. Repeat until the model stops asking for tools.
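That loop can be sketched in a few lines of TypeScript. Every name here (`agentLoop`, `callModel`, `runTool`) is illustrative, not the actual Claude Code implementation; the shape is what matters:

```typescript
// A minimal sketch of the tool-use loop the preprint describes.
type ToolCall = { name: string; input: unknown };
type ModelOutput = { text: string; toolCalls: ToolCall[] };

async function agentLoop(
  callModel: (context: string[]) => Promise<ModelOutput>,
  runTool: (call: ToolCall) => Promise<string>,
  userPrompt: string,
): Promise<string> {
  const context: string[] = [userPrompt];
  while (true) {
    const output = await callModel(context);      // 1. call the model
    context.push(output.text);
    if (output.toolCalls.length === 0) {
      return output.text;                         // model stopped asking for tools
    }
    for (const call of output.toolCalls) {
      const result = await runTool(call);         // 2. run the tools it requested
      context.push(`[${call.name}] ${result}`);   // 3. feed the results back
    }
  }                                               // 4. repeat
}
```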
That’s it. That’s the center.
If you were expecting a clever planner, a graph of specialist sub-agents, or a reinforcement-learned controller, you’re reading the wrong paper. The architecture at the heart of a system good enough to ship production code is something a second-year CS student could sketch on a napkin.
This is the interesting part. Not because simplicity is virtuous for its own sake, but because of where the complexity actually went.
Complexity doesn’t live at the center
When a system is simple at the core, the work moved somewhere else. In agents, it moved to the perimeter.
The perimeter has three parts, and they are the entire governance surface of the product:
Tools. What the agent can do. Read a file. Edit a file. Run a command. Fetch a URL. Each tool is an authorization boundary. A tool definition is a contract: inputs, outputs, side effects, and what the tool refuses to do. Claude Code’s behavior in your repo is a function of which tools are registered and what those tools guard against.
Prompts. What the agent thinks it is and what it thinks it’s doing. The system prompt sets identity. The context window shapes judgment. CLAUDE.md files, loaded before every turn, inject the project’s rules into the model’s working memory. Prompts are where policy lives.
Checkpoints. Where humans get to intervene. Permission prompts before destructive tool calls. The pause before pushing to a branch. The review step before a commit. Checkpoints are the audit surface. Every resumable decision point is a place where a human, a hook, or a policy engine can say no.
Loop, tools, prompts, checkpoints. That’s the whole architecture. The loop is trivial. Everything that matters happens at the edges.
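To make "a tool definition is a contract" concrete, here is a hedged sketch of what such a contract can look like: declared inputs, declared side effects, and explicit refusal modes. The `ToolDefinition` interface and the `edit_file` example are hypothetical, not Claude Code's actual definitions:

```typescript
// Hypothetical tool contract: inputs, outputs, side effects, refusals.
interface ToolDefinition<I, O> {
  name: string;
  description: string;           // what the model is told the tool does
  sideEffects: "none" | "filesystem" | "network" | "shell";
  requiresConfirmation: boolean; // checkpoint: a human must approve
  validate(input: unknown): I;   // throws on inputs the tool refuses
  run(input: I): Promise<O>;
}

const editFile: ToolDefinition<{ path: string; patch: string }, string> = {
  name: "edit_file",
  description: "Apply a patch to a file inside the repository.",
  sideEffects: "filesystem",
  requiresConfirmation: true,
  validate(input) {
    const { path, patch } = input as { path?: string; patch?: string };
    if (!path || !patch) throw new Error("edit_file: path and patch are required");
    if (path.startsWith("/") || path.includes("..")) {
      throw new Error("edit_file refuses paths outside the repository");
    }
    return { path, patch };
  },
  async run({ path, patch }) {
    return `patched ${path} (${patch.length} bytes)`; // stand-in for real file I/O
  },
};
```

The refusal lives in the tool, not in the model's goodwill: a path traversal fails validation before anything touches disk.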
The audit unit is a turn
If the core is a while-loop, the observable contract is the turn. Each iteration is: model input (prompt plus context plus prior tool results), model output (text plus tool calls), tool execution, new context.
This matters because it gives you a unit to audit. You don’t need to reason about emergent behavior across fifteen steps. You need to reason about: what did this turn receive, what did it decide to do, what did the tool return, and would a human reviewing this single turn be comfortable with the choice?
A team that logs turns has everything it needs. Prompt sent, tool calls made, tool results received, next prompt composed. Every governance question about an agent reduces to questions about specific turns. Who authorized this? What did the model see? What did the tool return? Was there a checkpoint that should have fired?
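A minimal sketch of such a turn log, with illustrative field names (nothing here is taken from the paper), plus the kind of mechanical check an audit can run over each record:

```typescript
// One turn as an auditable record (field names are illustrative).
interface TurnRecord {
  turnId: number;
  promptSent: string;        // model input: prompt + context + prior results
  toolCallsMade: { name: string; input: unknown }[];
  toolResults: string[];
  checkpointFired: boolean;  // did a human or policy engine gate this turn?
  approvedBy?: string;       // who authorized it, if anyone
}

// Flag the governance gaps a single turn can reveal.
function auditTurn(turn: TurnRecord): string[] {
  const findings: string[] = [];
  if (turn.toolCallsMade.length !== turn.toolResults.length) {
    findings.push("tool call without a logged result");
  }
  if (turn.checkpointFired && !turn.approvedBy) {
    findings.push("checkpoint fired but no approver recorded");
  }
  return findings;
}
```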
Without turn-level observability, you’re governing a ghost. With it, you’re governing a sequence of discrete decisions, each one inspectable.
What the finding is not
A few caveats, because the “simple at the core” narrative is easy to overextend.
This is an analysis of observable source, not an authoritative architecture document from Anthropic. The authors looked at a leaked TypeScript bundle and described what they saw. That’s a reasonable basis for a paper, but it’s not a spec. Call it what it is.
The finding is also not Claude-specific. Most agent frameworks converge on the tool-use loop pattern because the model itself expects to be called in that shape. LangChain agents, AutoGen agents, OpenAI’s Assistants API, Agents SDK. All of them loop. The paper’s contribution is confirming that a production-grade agent didn’t find a cleverer structure. It doesn’t invent the pattern.
And “simple at the core” doesn’t mean “simple overall.” The tool definitions are elaborate. The prompts are long. The permission system has edge cases. Context management is the actual engineering work: deciding what stays in the window, what gets compacted, what gets dropped. Complexity moved. It didn’t disappear.
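As a toy illustration of what "what stays, what gets compacted, what gets dropped" can mean, here is a context-compaction sketch under a token budget. The heuristics (roughly four characters per token, newest-first retention, a summary marker for the dropped prefix) are assumptions for illustration, not Claude Code's actual policy:

```typescript
// Keep the newest messages that fit the budget; replace the dropped
// prefix with a one-line marker. Purely illustrative policy.
function compactContext(
  messages: string[],
  tokenBudget: number,
  countTokens: (s: string) => number = (s) => Math.ceil(s.length / 4),
): string[] {
  const kept: string[] = [];
  let used = 0;
  // Walk backwards so the most recent messages are retained first.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i]);
    if (used + cost > tokenBudget) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  const dropped = messages.length - kept.length;
  return dropped > 0
    ? [`[compacted: ${dropped} earlier message(s) dropped]`, ...kept]
    : kept;
}
```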
We explored a related thread in Claude Code Insights Command, where the telemetry question was: how do you turn agent activity into something humans can review? The while-loop finding is the architectural companion to that question. Turns are what you log. Insights are what you derive from them.
What this means for teams building agents
Three implications, none of them about the loop itself.
Spec your tools like they’re public APIs. They are, effectively. Your agent’s behavior is bounded by the tool surface. A tool with vague permissions is a governance hole. A tool with strict schema and explicit refusal modes is a governance asset. Treat tool definitions as first-class architectural artifacts, versioned and reviewed.
Treat prompts as policy, not copy. System prompts, CLAUDE.md files, tool descriptions. These aren’t documentation. They’re the runtime configuration of your agent’s judgment. Change control applies. A PR that edits a prompt is a PR that changes behavior, and it should be reviewed with the same care as a code change.
Make checkpoints intentional. Permission prompts are not a UX annoyance to be minimized. They’re the human-in-the-loop layer, and they’re the audit trail. When you design an agent workflow, decide deliberately: what requires confirmation, what runs autonomously, what logs for later review. This is the decision that auto mode governance forces explicitly, and that every agent team makes implicitly. Make it explicit.
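One way to make that decision explicit is a small checkpoint policy that maps every tool call to confirm, allow, or log-only. The policy below is a hypothetical example, not any product's actual rules:

```typescript
// Checkpoint policy sketch: per tool call, decide what requires
// confirmation, what runs autonomously, and what is merely logged.
type Decision = "confirm" | "allow" | "log-only";

interface CheckpointPolicy {
  decide(tool: string, input: { path?: string; command?: string }): Decision;
}

const examplePolicy: CheckpointPolicy = {
  decide(tool, input) {
    if (tool === "run_command" || tool === "git_push") return "confirm";
    if (tool === "edit_file" && input.path?.endsWith(".env")) return "confirm";
    if (tool === "read_file") return "log-only";
    return "allow";
  },
};
```

Writing the policy down as code is the point: it turns an implicit habit into a reviewable, versionable artifact.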
The source leak discussion reminded us that agent implementations aren’t moats. Architectures are convergent. Everyone has the loop. What distinguishes a well-governed agent from a dangerous one is the discipline at the edges.
The takeaway
If you’re architecting an agent system and you find yourself designing a clever core, stop and look at your perimeter. The center is probably a loop. The product is probably the tools, the prompts, and the checkpoints.
That’s where the real work is. That’s also where the real governance is. They turn out to be the same surface.
This analysis synthesizes Claude Code Architecture Analysis (arXiv preprint, April 2026).
Victorino Group helps teams treat the agent perimeter (tools, prompts, checkpoints) as the governance surface. Let’s talk.
All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com. About The Thinking Wire →