Agents Don't Do Standups: PFF and the Org Inversion

Software engineering management spent twenty years optimizing for engineer speed. Scrum, sprint planning, daily standups, refinement, retrospectives. Every ceremony descends from one premise: developer hours are the scarce resource, so coordinate them carefully.

At the AI Engineer Conference in May 2026, Mike Spitz, CTO of Pro Football Focus, walked through a three-month experiment that tested what happens when that premise is no longer true. Two engineers working with agents were compared against a team of roughly ten engineers working without them. Same codebase, same customers, January through March 2026. Self-reported headline numbers: 25x deploy frequency, 10x output measured as ticket count weighted by code complexity, and average customer satisfaction of 8.6 against a pre-AI baseline of roughly 7.0 to 7.5.

PFF is not a research lab. It is a sports data company with 100 million page views a year, nine million fantasy drafts a year, 200 employees and about 20 engineers, serving NFL and NCAA teams alongside a consumer fantasy and betting product. The case study lands at scale, on production code, with paying customers. That is what makes it interesting.

The interesting question is not whether two engineers can replace ten. We have written before about Carlini’s 16-agent compiler experiment and what it implies about agents-as-workforce. The interesting question is what the surrounding organization looks like when you stop optimizing for engineer ergonomics and start optimizing for agent throughput. Spitz’s answer: the ceremonies collapse first.

The Ceremonies Were Solving for a Constraint That Vanished

Scrum was not handed down from a mountain. It is an artifact, formalized in the 1990s and refined through the 2000s to solve a specific coordination problem: how do you get a small number of expensive, slow, human engineers to ship coherent software without stepping on each other? The daily standup answers “what is blocking you today, while you still have eight hours of typing to do?” The sprint plan answers “what is the realistic capacity of these humans over the next two weeks?” The retrospective answers “how do we make these humans slightly less frustrated next sprint?”

Every one of those questions assumes engineer hours are the binding constraint.

PFF dismantled the entire stack. Spitz lists what went: the product manager role, sprint planning, daily standups, sprint refinement, retrospectives. What replaced them is almost embarrassingly small. A half-hour huddle every other day. Engineers flag blockers as they happen, not at 10 AM the next morning. The retrospective signal is replaced by a customer satisfaction survey, because the customers are the ones who know if last week’s work was good. The PM function, meaning the spec writing, the ticket grooming, and the status synchronization, moved into agents.

This is not “we still do Scrum but with AI helping.” It is the explicit deletion of the ceremonies, on the explicit reasoning that the constraint they were designed for is gone.

The Workflow That Replaces It

Spitz described the loop PFF runs now, and it is worth tracing because the topology matters.

A spec comes in. An agent writes a Lightweight Design Document, which it composes by reading every prior LDD in the repository to learn what shape these documents take at PFF. Auto-generated tickets get created from the LDD, preserving non-blocking topology so independent work can proceed in parallel. Pull requests carry status that syncs automatically back to the ticket system. After merge, a QA agent spins up on staging and validates each ticket against its acceptance criteria.
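The "non-blocking topology" step can be illustrated with a dependency graph. The sketch below is a minimal illustration, not PFF's tooling: the ticket names and the `blockers` mapping are hypothetical, and it uses Python's standard `graphlib` module to group tickets into waves that can proceed in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical tickets derived from one LDD. Each key maps to the set of
# tickets that must merge before it can start (its blockers).
blockers = {
    "T1-schema": set(),
    "T2-api": {"T1-schema"},
    "T3-ui": {"T2-api"},
    "T4-docs": set(),
    "T5-metrics": {"T1-schema"},
}


def parallel_waves(deps):
    """Group tickets into waves; every ticket in a wave is unblocked at once."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if tickets block each other
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # marking a wave done releases its dependents
    return waves


waves = parallel_waves(blockers)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Tickets inside one wave have no dependency on each other, so separate agents could pick them up concurrently. A cycle in the graph would signal that the ticket breakdown itself needs rework.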

The thing to notice is that this is not “agents help engineers code faster.” It is “agents replace the connective tissue between engineers.” The LDD, the tickets, the status updates, and the QA passes are the work that historically required a PM, a tech lead, a scrum master, a QA engineer, and the engineers themselves to keep in sync. Most of that work has nothing to do with writing code. It is coordination overhead, and coordination overhead is exactly the kind of work that agents are good at when the artifacts are structured and the rules are explicit.

The two engineers focus on the parts of the loop that still require taste: system design decisions, code review of architectural choices, and customer-facing judgment calls. Everything in between is delegated.

Code Review Splits, It Does Not Die

The most subtle move in Spitz’s redesign is the split he made on code review. He did not eliminate it. He bisected it.

Style review, naming conventions, “I would have done this differently” bikeshedding, opinion-driven feedback that nobody enjoys giving or receiving: agents handle that. System design review, architectural coherence, the question of whether the change fits the model of the platform: engineers handle that. His framing: “We use agents to do the code reviews engineers hate getting feedback from. Remove the whole emotional aspect out of it.”

This is one of those operational details that sounds small and is not. A meaningful share of engineering culture pain comes from peer review feedback delivered badly. Senior engineers who critique style, junior engineers who feel attacked, the slow erosion of psychological safety when feedback is technically correct but socially expensive. Moving the low-value review surface to an agent does not just save time. It removes a recurring source of organizational friction. The remaining human review is reserved for the conversations that actually require humans, which makes those conversations both more focused and more respected.

The principle generalizes. Anywhere in your engineering process where the work is rule-based but the delivery is emotionally fraught, the agent is the better operator.
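The review split can be expressed as a simple routing rule. This is a sketch under assumptions: the category labels, the `ReviewItem` type, and the `route` function are hypothetical names chosen for illustration, and nothing here reflects how PFF actually classifies feedback.

```python
from dataclasses import dataclass
from enum import Enum


class Reviewer(Enum):
    AGENT = "agent"
    ENGINEER = "engineer"


# Hypothetical labels for rule-based feedback that an agent can deliver.
AGENT_CATEGORIES = {"style", "naming", "formatting", "preference"}


@dataclass(frozen=True)
class ReviewItem:
    category: str
    summary: str


def route(item: ReviewItem) -> Reviewer:
    """Send rule-based feedback to the agent and everything else to a human."""
    if item.category in AGENT_CATEGORIES:
        return Reviewer.AGENT
    # Architectural or unrecognized categories default to human review.
    return Reviewer.ENGINEER


items = [
    ReviewItem("naming", "rename tmp to draft_pick"),
    ReviewItem("architecture", "new service bypasses the event bus"),
]
for item in items:
    print(f"{route(item).value}: {item.summary}")
```

Defaulting unknown categories to a human is a deliberate choice in this sketch: when the classification is unclear, the safer failure mode is an engineer seeing a comment they did not need to see.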

Customer Satisfaction Went Up, Not Down

The piece of the case study that most resists the standard skepticism is the customer satisfaction number. Pre-AI baseline at PFF was around 7.0 to 7.5. Over the three-month experiment, average customer satisfaction landed at 8.6.

A common objection to AI-augmented engineering is that velocity comes at the cost of quality, and that customers will notice. PFF’s numbers, self-reported and from a single company, point the other way. More frequent deploys mean shorter feedback loops, so defects get caught faster and feature requests turn around faster. The QA agent running against acceptance criteria on staging catches a class of regressions that previously slipped through. The 25x deploy frequency is not 25x more risk surface; it is 25x more chances to detect and correct.

The caveat to underline: these numbers are disclosed by the CTO at a conference. They are not third-party validated. They reflect one company over three months. Treat them as an existence proof, not a benchmark to copy. The point is not “every team should expect 8.6 CSAT.” The point is that the assumption that AI velocity must trade against quality now has at least one strong counterexample.

The Engineer Profile Shifts

Spitz called out a hiring and retention implication that most discussions of AI-augmented engineering skip. The new setup does not work for every engineer.

Engineers who thrive: the curious ones, willing to dig into unfamiliar systems, comfortable operating without a prescriptive specification handed to them. They treat the agent as a junior team that can take on work, but they take responsibility for the architectural direction. They are intrinsically motivated to figure out what should be built.

Engineers who struggle: the ones who require a fully specified Jira ticket before they begin work, who relied on the PM and the spec doc as the source of direction. The structural support those engineers needed has been removed, and the agents do not replace it. The agents amplify whatever direction the engineer provides, which is wonderful if the engineer has direction and difficult if the engineer was depending on the org to supply it.

This is a real organizational design question for any team contemplating the shift. The engineers who succeed in a post-ceremony environment are a specific profile. Hiring and management practices that filtered for “delivers reliably against tight specs” will produce a roster that does not match the new operating model.

Compounding, Not Linear

One earlier internal data point from PFF deserves attention. Before AI, the same feature set the two-engineer team shipped had been estimated at four months. The two-engineer team shipped in under two months, and one of the engineers was unblocked enough within the first month to start parallel work.

A single multiplier, whether 2x or 5x, misses what happened. The gain is non-linear because the bottleneck shifted. When one engineer’s contribution not only unblocks that engineer but also creates room for the agent fleet to operate on a second workstream, the team capacity compounds. The relevant variable is not “how fast can the engineer type” but “how many independent agent-driven workstreams can the engineer hold open at once.”

The implication for capacity planning is uncomfortable. The estimates your team produces today assume the old constraint. The estimates that match what you can actually ship, given the new tools, are different by a multiple that depends on how thoroughly you have inverted the org.

Do This Now

You do not need to dismantle Scrum next week. You do need to run a single, concrete exercise.

Pick the next two-week sprint. List every ceremony you run: standups, refinement, retrospective, sprint planning, demo. For each ceremony, write down the original problem it was solving. Most of those problems will turn out to be “humans need to coordinate scarce time on scarce keyboards.” Then look at which of those problems still exists in your environment now that agents are part of the team. Some will. Most will not.

That exercise is not a Scrum-killing exercise. It is a constraint-naming exercise. PFF did not delete ceremonies because ceremonies are bad. They deleted ceremonies because the constraints those ceremonies were solving had moved. The exercise is to find out, with honesty, which of your ceremonies are still solving a real problem and which are organizational muscle memory.
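The audit can be kept as a small structured record so the answers are explicit and easy to revisit. The sketch below assumes a hypothetical `Ceremony` record. The `still_binding` values are placeholders for illustration only, since that judgment has to come from your own team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ceremony:
    name: str
    original_problem: str
    still_binding: bool  # a human judgment call; values below are placeholders


# Placeholder answers for illustration only. Fill these in for your own team.
audit = [
    Ceremony("daily standup", "surface blockers once per day", False),
    Ceremony("sprint planning", "estimate human capacity for two weeks", False),
    Ceremony("refinement", "turn vague requests into workable tickets", False),
    Ceremony("retrospective", "learn whether last sprint's work was good", True),
    Ceremony("demo", "show customers what shipped", True),
]


def split_audit(ceremonies):
    """Return (still solving a live problem, candidates to replace or drop)."""
    keep = [c.name for c in ceremonies if c.still_binding]
    reconsider = [c.name for c in ceremonies if not c.still_binding]
    return keep, reconsider


keep, reconsider = split_audit(audit)
print("still binding:", keep)
print("reconsider:", reconsider)
```

A ceremony landing in the second list is not automatically deleted. It is a prompt to ask whether a lighter mechanism, such as a survey or an automated status sync, now covers the original problem.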

The teams that will out-execute the market over the next two years are not the ones that adopt agents. Almost everyone will adopt agents. The winners will be the ones that redesign the surrounding organization to stop optimizing for a constraint that has moved.

This analysis synthesizes Agents Don’t Do Standups (Mike Spitz, PFF, AI Engineer Conference 2026), the PFF consumer and pro-team products, and prior Victorino analysis of the new operating model.
