Operating AI

The Agent Operations Paradox: More Agents, More Work, Less Uptime

Thiago Victorino

Nan Yu, Linear’s Head of Product, said something this month that deserves more scrutiny than it received: “You’ll have AI team members that you can assign tasks to and talk to just like how you talk to people.”

Linear now reports that agents generate the majority of tickets on the platform. Simple bugs and small features get assigned directly to AI coding agents. Engineers reserve complex work for themselves, using Claude Code with Linear’s MCP integration. The vision is organizational: agents as colleagues, not tools.

Ramp’s CPO Geoff Charles pushed it further: “If you’re not using Claude Code, no matter what your role is, you’re probably underperforming.” Ramp claims over 500 shipped features through their AI proficiency system. Charles’s summary: “Your job is to automate your job.”

The framing is seductive. Agents are team members. Assign them work. Talk to them like people. Watch the output multiply.

Here is what nobody in that conversation mentioned: the companies building these agents cannot keep their own infrastructure running at three-nines.

The Uptime Problem

Lorin Hochstein analyzed the uptime records of OpenAI and Anthropic this month. OpenAI’s ChatGPT runs at 98.86% uptime. That sounds high until you do the math. Three-nines (99.9%) allows 8.7 hours of downtime per year. OpenAI’s number allows roughly 100 hours. Most services at both companies fail to reach the three-nines threshold. Only Sora, of all things, manages 99.9%.
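The arithmetic is easy to check. A minimal sketch, assuming a 365-day year:

```python
# Annual downtime implied by an availability percentage.
HOURS_PER_YEAR = 365 * 24  # 8,760

def annual_downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year allowed at a given availability level."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

print(round(annual_downtime_hours(99.90), 1))  # three nines: ~8.8 hours
print(round(annual_downtime_hours(98.86), 1))  # ChatGPT's figure: ~99.9 hours
```

A full order of magnitude separates the two numbers, which is the whole point.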

Boris Cherny from Anthropic offered context: “This is what hypergrowth looks like…10x y/y growth ain’t…” Both companies are experiencing 10x year-over-year growth. Hypergrowth explains the reliability problems. It does not solve them.

Now connect the dots. Linear wants you to treat agents as team members. Those agents depend on API infrastructure that delivers sub-three-nines reliability. A human team member who showed up 98.86% of the time would miss roughly three full workdays per year unannounced. You would not call that employee reliable. You would start managing them out.

As we examined in The Operations Tax, the cost of operating AI agents in production compounds in ways that demos never reveal. Uptime is one of those costs. When your agent workforce depends on external APIs with consumer-grade reliability, your operations team inherits that reliability profile. Every agent task that fails because an API is down creates cascading work: retry logic, state recovery, failed-task triage, customer communication about delays.

The obvious response is redundancy. Route through multiple providers. Build fallback chains. But redundancy for AI agents is not like redundancy for databases. Different models produce different outputs. A task started on Claude and finished on GPT may produce incoherent results. Provider failover for AI agents is an unsolved operations problem that most teams have not even started thinking about.
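A fallback chain is easy to sketch, and the sketch makes the limitation visible. In the hypothetical wrapper below (the provider callables are stand-ins, not a real SDK), failover happens at whole-task granularity: the next provider reruns the task from scratch rather than continuing a half-finished one, precisely because mixed-model continuations can be incoherent.

```python
from typing import Callable

class ProviderError(Exception):
    """Raised when a model provider is unavailable or errors out."""

def call_with_fallback(task: str, providers: list[Callable[[str], str]]) -> str:
    """Run the whole task on each provider in order until one succeeds.

    Failing over at whole-task granularity avoids handing a half-finished
    task started on one model to a different model mid-stream.
    """
    errors: list[Exception] = []
    for provider in providers:
        try:
            return provider(task)
        except ProviderError as exc:
            errors.append(exc)
    raise ProviderError(f"all {len(providers)} providers failed: {errors}")
```

Note what this does not solve: rerunning from scratch discards partial work, and the outputs of different providers still need to be interchangeable for your downstream consumers.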

The Cowan Paradox

Ruth Schwartz Cowan published “More Work for Mother” in 1983. Her thesis: the vacuum cleaner did not reduce housework. It raised cleanliness standards. Tasks that were done monthly became weekly. Surfaces that were acceptable became unacceptable. The labor-saving technology created more labor.

Jason Lemkin at SaaStr named the AI version of this the Cowan Paradox. A Berkeley Haas field study tracked workers over eight months as they adopted AI tools. The workers did adopt the tools. They also expanded into adjacent roles, stopped separating work hours from non-work hours, and ended up busier than before. The technology saved time on individual tasks. The organization consumed that time by raising expectations.

SaaStr’s Chief AI Officer, Amelia, provides a concrete case. She spends one hour every morning managing 20+ AI agents. That is the same time she would spend managing two human reports. The output is 10x. But the management overhead is real and daily.

The difference between the vacuum cleaner and AI agents is speed. Cowan documented a phenomenon that took decades to play out. AI agents are resetting expectations in months. An engineering team that ships 500 features through AI proficiency levels (Ramp’s number) does not then settle into a comfortable rhythm. The bar moves. Next quarter, the expectation is 700. The quarter after that, 1,000.

This is the paradox. Each agent you add multiplies output. Multiplied output raises expectations. Raised expectations demand more agents. More agents demand more operations work. The person managing 20 agents is not doing less work than the person who managed two humans. They are doing different work, at higher volume, with less tolerance for downtime.

The Three-Body Problem of Agent Operations

These three forces interact in ways that are worse than any one alone.

Force 1: Agents as team members. Linear’s model assigns agents work the same way you assign humans work. This means agents need the same operational infrastructure as humans: task queues, status tracking, escalation paths, performance monitoring. Except agents fail differently than humans. A human who gets stuck asks for help. An agent that gets stuck either retries silently, produces wrong output confidently, or stalls without notification.

Force 2: Infrastructure unreliability. The APIs these agents depend on deliver consumer-grade uptime. When your agent team member goes offline because Anthropic is having an incident, there is no standup where the agent explains what happened. There is a gap in your ticket queue, a half-finished task with unclear state, and an engineer who now needs to figure out what was done and what was not.

Force 3: Rising expectations. Every efficiency gain from agents gets absorbed by expanded scope. The team that used to ship 50 features per quarter now ships 200. The operations infrastructure that supported 50 features does not support 200. But nobody budgets for 4x operations capacity when they budget for 4x output.

The interaction between these forces is what makes agent operations genuinely hard. It is not any single problem. It is that solving one problem (adding more agents for throughput) amplifies the other two (more infrastructure dependencies, higher expectations). As we explored in From In-the-Loop to On-the-Loop, the teams that succeed are the ones building systems around agents rather than using agents as faster typists. But even that framing underestimates the operational load.

What Operations Discipline Looks Like

The companies that will manage this well are the ones that treat agent operations with the same rigor they apply to human team management. Not metaphorically. Literally.

SLAs for agent work, not just API calls. Monitoring API uptime is necessary but insufficient. You need SLAs for task completion: what percentage of assigned tasks finish successfully within the expected timeframe? What is the retry rate? What is the state-recovery success rate after an infrastructure failure? These are operational metrics that most teams do not track because they are still thinking of agents as tools, not workers.
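Tracking these takes very little machinery. A minimal sketch, with illustrative field names rather than any standard schema:

```python
from dataclasses import dataclass

@dataclass
class AgentTaskStats:
    assigned: int           # tasks handed to agents this period
    completed_on_time: int  # finished successfully within the SLA window
    retried: int            # tasks that needed at least one retry
    failed_mid_task: int    # tasks interrupted by an infrastructure failure
    recovered: int          # of those, recovered to a clean state

def sla_report(s: AgentTaskStats) -> dict[str, float]:
    """The three task-level SLA rates described above, as fractions."""
    return {
        "completion_rate": s.completed_on_time / s.assigned,
        "retry_rate": s.retried / s.assigned,
        "state_recovery_rate": (
            s.recovered / s.failed_mid_task if s.failed_mid_task else 1.0
        ),
    }
```

The point is not the code. It is that these three numbers exist for every agent deployment today; almost nobody is looking at them.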

Capacity planning that accounts for the Cowan Paradox. If you deploy 20 agents and output triples, your operations team needs to triple. Not next year. Now. The expectation ratchet moves faster than hiring cycles. Plan for it before the first agent ships, or accept that you will spend six months in operational debt.

Graceful degradation for the workforce, not just the service. When an API goes down, what happens to the 15 tasks that were in progress? SRE teams have runbooks for service degradation. Almost nobody has runbooks for workforce degradation: how to triage half-completed agent work, how to reassign tasks to humans or alternative agents, how to verify the integrity of partial outputs.
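A workforce-degradation runbook can start as a simple state-to-action table. A sketch, where the states and actions are one hypothetical policy rather than a recommendation:

```python
from enum import Enum, auto

class InterruptedTask(Enum):
    NOT_STARTED = auto()      # claimed by an agent, no work done yet
    IN_PROGRESS = auto()      # partial work, state unclear
    PARTIAL_OUTPUT = auto()   # produced artifacts of unknown integrity
    DONE_UNVERIFIED = auto()  # agent reported success just before the outage

# One possible triage policy for tasks caught mid-flight by an outage.
RUNBOOK = {
    InterruptedTask.NOT_STARTED: "requeue on the same or a fallback provider",
    InterruptedTask.IN_PROGRESS: "discard partial state and restart cleanly",
    InterruptedTask.PARTIAL_OUTPUT: "route to a human for an integrity check",
    InterruptedTask.DONE_UNVERIFIED: "verify the output before closing the ticket",
}

def triage(state: InterruptedTask) -> str:
    """Map each task caught in an outage to a runbook action."""
    return RUNBOOK[state]
```

Even a table this crude is ahead of the current norm, which is an engineer improvising the same decisions during the incident.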

As we discussed in Agent Teams and the Shift from Writing Code to Directing Work, environment design determines agent output quality. That principle extends to operations. The operational environment (monitoring, alerting, task recovery, capacity planning) determines whether your agent workforce is reliable or chaotic.

Incident management that includes agent failure modes. Human incidents have a familiar shape: someone made a mistake, a system broke, a process failed. Agent incidents have unfamiliar shapes: the agent completed the task successfully but the output was wrong. The agent retried a failed API call 47 times and ran up a $2,000 bill. The agent claimed a task, made partial progress, stalled during an outage, and left the system in an inconsistent state that a second agent then compounded. Your incident management process needs categories for these failure modes, or they will be classified as “weird things that happened” and never produce systemic fixes.
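The 47-retry incident is preventable with a bounded retry wrapper that caps both attempts and spend. A sketch, with illustrative limits and a hypothetical flat per-call cost:

```python
import time

def call_with_budget(call, *, max_retries: int = 5, max_cost_usd: float = 50.0,
                     cost_per_call: float = 0.50, sleep=time.sleep):
    """Retry a flaky call, but stop at a retry cap or a spend cap,
    whichever comes first."""
    spent = 0.0
    for attempt in range(max_retries):
        if spent + cost_per_call > max_cost_usd:
            raise RuntimeError(f"spend cap reached after {attempt} calls")
        spent += cost_per_call
        try:
            return call()
        except Exception:
            sleep(2 ** attempt)  # exponential backoff between attempts
    raise RuntimeError(f"retry cap reached after {max_retries} calls")
```

Either cap turns a silent $2,000 runaway into a loud, attributable incident, which is the category shift the process needs.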

The Operating Model Question

Geoff Charles’s quote is correct in one direction: your job is to automate your job. What he did not say is that automating your job creates a new job. The new job is operating the automation.

This is not a reason to avoid agents. The productivity gains are real. Ramp’s 500 features are real. Linear’s agent-driven ticket management is real. The output multiplier is not marketing fiction.

But the operating model that supports 5 agents is different from the one that supports 50, which is different from the one that supports 500. Each order of magnitude requires new infrastructure, new processes, new roles, and new metrics. The organizations that treat agent deployment as a tool-adoption exercise will hit the paradox hard: more agents, more work, less uptime, and nobody budgeted for any of it.

The organizations that treat it as an operations transformation will have a different experience. Not an easy one. But a manageable one.


This analysis synthesizes Peter Yang’s report on Linear’s agent-as-teammate model (March 2026), Lorin Hochstein’s analysis of AI company uptime (March 2026), and Jason Lemkin’s coverage of the Cowan Paradox at SaaStr (March 2026), with Ruth Schwartz Cowan’s “More Work for Mother” (1983) and a Berkeley Haas field study on AI adoption and work expansion.

Victorino Group helps organizations build operations infrastructure for AI agent workforces before the paradox compounds. Let’s talk.
