- Agents Are Not Tools — But Sometimes They Should Be
A user asks a trip planner to book five days in London for under $4,000. The system queries a flight agent, a hotel agent, and a car rental agent. The flight agent asks what departure city. The hotel agent needs dates confirmed before it can search. The car rental agent pushes back: “Do you really want a car in London?” Each sub-agent is negotiating, clarifying, and sometimes disagreeing with the orchestrator. No single request produces a single response. The action does not complete upon return.
Now consider a weather lookup. You send coordinates. You get a forecast. Done.
Both involve an LLM. Both are called by another system. But they are architecturally different in ways that matter for every decision you make about how to build agentic software.
The Argument
Philip Stephens, writing on the Google Developer Forum, made a sharp observation that deserves wider attention: agents are not tools, and treating them as interchangeable causes real engineering problems.
His distinction is clean.
Tools follow a defined temporal sequence: request, action, completion or error. Inputs are structured. Outputs are structured. The domain is bounded. The execution is time-boxed. When you call a tool, it finishes.
Agents handle incomplete states, changing requirements, and multi-turn interactions. When you call an agent, the action is not guaranteed to be completed upon return. The agent might need clarification. It might have partially completed some sub-tasks but not others. It might return a state that requires the caller to make a decision before proceeding.
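The contrast between the two contracts can be sketched in code. This is an illustrative sketch, not an implementation of any particular protocol: the names `weather_tool`, `flight_agent`, and `AgentTurn` are hypothetical. The point is in the return types: the tool returns a result, while the agent returns a state the caller must inspect.

```python
from dataclasses import dataclass, field
from enum import Enum


# --- Tool interface: one call, one bounded result ---
def weather_tool(lat: float, lon: float) -> dict:
    """Structured input, structured output, completes on return."""
    return {"lat": lat, "lon": lon, "forecast": "rain"}


# --- Agent interface: the return value is a state, not a result ---
class AgentStatus(Enum):
    COMPLETED = "completed"
    NEEDS_INPUT = "needs_input"   # the agent is asking a clarifying question
    PARTIAL = "partial"           # some sub-tasks done, others pending


@dataclass
class AgentTurn:
    status: AgentStatus
    message: str
    artifacts: dict = field(default_factory=dict)


def flight_agent(request: dict) -> AgentTurn:
    """May complete, or may hand control back with a question."""
    if "departure_city" not in request:
        return AgentTurn(AgentStatus.NEEDS_INPUT,
                         "What city are you departing from?")
    return AgentTurn(AgentStatus.COMPLETED,
                     "Flight options found.",
                     {"flights": ["LHR-123"]})
```

A caller of `weather_tool` can assume the work is done when the function returns. A caller of `flight_agent` cannot: it must branch on `status` before proceeding, which is exactly the extra obligation the agent interface imposes.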
Stephens draws an analogy to GOTO statements in programming. Just as GOTO disrupts expected execution flow, agent-to-agent communication disrupts the clean request-response pattern that tools provide. Modern programming did not eliminate GOTO entirely---it contained it within specific constructs like loop breaks and exception handlers. The same discipline should apply to agent interfaces: use them where the problem demands them, not as a default.
“The tool interface,” Stephens argues, “is a degenerate case of the agent interface.” Agents should only become tools when single-turn completion-or-error scenarios suffice.
This Is Not Theoretical
Two protocols now embody this distinction at the infrastructure level.
Anthropic’s Model Context Protocol (MCP), released in November 2024, defines how agents connect to tools and data sources. It is vertical: agent connects down to capabilities. Think of it as a standardized way for an agent to call a weather API, query a database, or read a file system. Structured inputs, structured outputs, bounded execution.
Google’s Agent-to-Agent protocol (A2A), launched in April 2025 with over 50 partners, defines how agents communicate with each other. It is horizontal: agent connects sideways to peer agents. A2A accommodates the messy reality that Stephens describes---incomplete states, multi-turn negotiation, asynchronous resolution.
Both protocols were donated to the Linux Foundation’s Agentic AI Foundation in December 2025, signaling that the industry considers this distinction foundational enough to standardize.
The adoption numbers tell a story about where the industry’s attention has gone. MCP grew from 100,000 downloads at launch to 97 million monthly SDK downloads by late 2025. A2A launched with strong partnership support but saw slower adoption. IBM introduced its own Agent Communication Protocol (ACP) for lightweight agent messaging, adding a third option to the mix.
Meanwhile, Gartner reported a 1,445% surge in multi-agent system inquiries from Q1 2024 to Q2 2025. Databricks measured 327% growth in multi-agent workflows on its platform.
The demand is real. The architectural confusion is also real.
Where the Debate Gets It Wrong
The “agents are not tools” argument is correct but incomplete.
The strict separation assumes you always know, at design time, which pattern a given capability requires. In practice, you often do not. And even when you do, the answer changes based on context.
Consider a code review agent. In one workflow, it operates as a tool: you send it a diff, it returns comments, done. In another workflow, it operates as an agent: it reviews the diff, finds an issue that requires understanding the broader architecture, asks the orchestrator for additional context, receives it, revises its assessment, and flags a design concern that was not in the original scope. Same capability. Different interaction patterns. The difference is not in what the agent can do---it is in what the situation demands.
The trip planner example makes this visible. The flight booking uses tool interfaces for search queries. The constraint negotiation between planner and sub-agents uses agent interfaces. Within a single system, both patterns coexist. The architectural skill is not choosing one over the other. It is knowing where each boundary falls.
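The code review example can be made concrete with a sketch, assuming a single capability that exposes both entry points. The class and method names here are hypothetical; what matters is that one mode completes on return while the other may hand control back.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReviewResult:
    comments: List[str]
    complete: bool                 # False: the reviewer is asking for more
    request: Optional[str] = None  # what it needs from the orchestrator


class CodeReviewCapability:
    """One capability; the interaction pattern depends on the entry point."""

    def review_tool(self, diff: str) -> List[str]:
        # Tool mode: diff in, comments out, done on return.
        return [f"nit: tighten the change starting at {diff[:20]!r}"]

    def review_agent(self, diff: str, context: dict) -> ReviewResult:
        # Agent mode: may hand control back instead of completing.
        if "architecture_docs" not in context:
            return ReviewResult(
                comments=[],
                complete=False,
                request="Send the architecture docs before I assess this diff.",
            )
        return ReviewResult(
            comments=["design concern: this widens the module's public API"],
            complete=True,
        )
```

Same underlying capability, two contracts. The orchestrator, not the reviewer, chooses which entry point the situation demands.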
The Real Question
The useful question is not “is this an agent or a tool?” It is: does your system know the difference?
A well-designed agentic architecture can handle both modes and degrade gracefully between them. When a task is straightforward---structured input, predictable output, bounded execution---treat the capability as a tool. Get the reliability, composability, and predictability benefits that tool interfaces provide. When a task involves negotiation, partial completion, evolving requirements, or multi-turn reasoning---use agent-to-agent communication. Accept the complexity because the problem demands it.
The failure mode is not picking the wrong pattern for a given interaction. The failure mode is building a system that cannot distinguish between the two.
We see this in production. Organizations that model everything as tool calls hit a ceiling when their workflows require genuine collaboration between capabilities. Organizations that model everything as agent-to-agent communication drown in complexity for tasks that should be simple function calls. The systems that work are the ones that support both patterns and make the boundary explicit.
Three Design Principles
For engineering leaders building agentic systems, this distinction suggests three principles:
1. Default to Tools, Escalate to Agents
Start with the simpler abstraction. Tool interfaces are easier to test, easier to monitor, easier to reason about, and easier to debug. Only introduce agent-to-agent communication when the interaction genuinely requires it---when completion is not guaranteed, when the sub-system needs to push back or ask questions, when the task scope may evolve during execution.
This is not a philosophical preference. It is an operational one. Every agent-to-agent boundary you introduce adds latency, state management complexity, and failure modes. The cost is worth paying when the problem demands it. It is wasted effort when the problem does not.
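The escalation path can be sketched as follows. This is a minimal illustration under assumed names: `NeedsConversation` and `summarize_tool` are hypothetical, standing in for whatever signal your bounded call uses to say it cannot finish alone.

```python
class NeedsConversation(Exception):
    """Raised when a bounded call discovers the task needs negotiation."""


def summarize_tool(ticket: str) -> str:
    # Hypothetical bounded capability: succeeds for routine tickets.
    if "conflicting requirements" in ticket:
        raise NeedsConversation("requirements conflict; needs clarification")
    return f"summary: {ticket[:30]}"


def handle(ticket: str) -> dict:
    try:
        # Default to the tool abstraction: cheap, bounded, testable.
        return {"mode": "tool", "result": summarize_tool(ticket)}
    except NeedsConversation as why:
        # Escalate to an agent interaction only when the task demands it.
        return {"mode": "agent", "opening_message": str(why)}
```

Most calls stay on the cheap path; the expensive multi-turn path is entered only when the simpler abstraction explicitly refuses.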
2. Make the Boundary Explicit
Your system should know, at the protocol level, whether it is making a tool call or initiating an agent interaction. This is what MCP and A2A provide: distinct protocols for distinct interaction patterns. If you are building custom infrastructure, the same principle applies. The interface contract for “call this and get a result” should be different from the interface contract for “start a conversation with this capability.”
When the boundary is implicit---when tool calls and agent interactions use the same interface and the difference is just “sometimes it takes longer”---you lose the ability to set appropriate timeouts, retry policies, state management strategies, and user expectations.
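One way to make the boundary explicit is to attach distinct operational policies to each interaction kind, so timeouts and retries are a property of the contract rather than something inferred from latency. The values below are illustrative, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallPolicy:
    timeout_s: float
    max_retries: int
    keeps_state: bool


# The boundary lives in the contract, not in "sometimes it takes longer".
TOOL_POLICY = CallPolicy(timeout_s=10.0, max_retries=3, keeps_state=False)
AGENT_POLICY = CallPolicy(timeout_s=300.0, max_retries=0, keeps_state=True)


def policy_for(interaction: str) -> CallPolicy:
    """Callers declare the interaction kind up front."""
    if interaction == "tool":
        return TOOL_POLICY
    if interaction == "agent":
        return AGENT_POLICY
    raise ValueError(f"unknown interaction kind: {interaction}")
```

Note the asymmetry: tool calls are stateless and safely retried, while an agent interaction carries conversational state, so blind retries are disabled and the timeout is generous.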
3. Design for Graceful Degradation
The most resilient agentic systems can downgrade an agent interaction to a tool call when conditions require it. If the code review agent’s broader architectural analysis is timing out, fall back to the simpler diff-comment pattern. Return what you have. Indicate what was not completed. Let the orchestrator decide whether to retry at the agent level or accept the tool-level result.
This is the same principle that makes distributed systems reliable: design for partial failure, not just success.
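The downgrade path described above can be sketched like this. The helpers `agent_review` and `tool_review` are hypothetical stand-ins; here the agent-level pass is simulated as always timing out so the fallback is visible.

```python
def agent_review(diff: str, deadline_s: float) -> dict:
    # Simulated: the broader architectural analysis blows its deadline.
    raise TimeoutError("architectural analysis exceeded deadline")


def tool_review(diff: str) -> dict:
    # The simpler, bounded diff-comment pass.
    return {"comments": ["style nit near the top of the diff"]}


def review_with_fallback(diff: str, deadline_s: float = 30.0) -> dict:
    """Downgrade the agent interaction to a tool call on timeout."""
    try:
        result = agent_review(diff, deadline_s)
        return {"level": "agent", "incomplete": [], **result}
    except TimeoutError:
        result = tool_review(diff)
        # Return what we have and say what was skipped; the orchestrator
        # decides whether to retry at the agent level.
        return {"level": "tool",
                "incomplete": ["architectural analysis"],
                **result}
```

The caller always gets a usable result plus an honest account of what was not done, which is what lets the orchestrator make the retry-or-accept decision.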
What We Have Learned Running Agents in Production
At Victorino Group, we operate a multi-agent team for our own consulting operations. A CEO agent coordinates strategy. A CTO agent handles technical decisions. Specialized agents manage marketing, content, and client work. We have lived with this architecture long enough to have opinions backed by experience rather than theory.
The lesson that took the longest to learn: the agents-vs-tools distinction is not a property of the capability. It is a property of the interaction. The same agent can and should operate in both modes depending on what the situation requires. Our CTO agent sometimes acts as a tool---answering a direct technical question with a direct answer. Other times it acts as an agent---pushing back on a proposed architecture, requesting additional context, or escalating a concern that changes the scope of the discussion.
The system knows the difference because we designed it to. That architectural awareness is what makes the whole thing work.
The Bottom Line
Philip Stephens is right that agents are not tools. The temporal patterns, the state management requirements, and the interface contracts are fundamentally different. The industry’s development of separate protocols for each pattern---MCP for tool integration, A2A for agent collaboration---validates this distinction at the infrastructure level.
But the practical insight goes one step further. In production systems, the same capability often needs to operate in both modes. The architectural skill is not picking a side in the agents-vs-tools debate. It is building systems that handle both patterns, make the boundary explicit, and degrade gracefully when conditions demand it.
The question is not whether your agents are tools. The question is whether your architecture knows when they should be.
Sources: Philip Stephens, “Agents are Not Tools,” Google Developer Forum (June 2025). MCP adoption data from Anthropic and community metrics. Gartner multi-agent inquiry data (2024-2025). Databricks multi-agent growth metrics (2025). Linux Foundation Agentic AI Foundation announcement (December 2025).
At Victorino Group, we help engineering teams design and govern agentic architectures that work in production---including knowing when to use agents, when to use tools, and when the same system needs both. If that is what you are building, reach out at contact@victorinollc.com.