The Four Pillars of AI-Era Software Are Not Co-Equal

Modern software development gets described as four interrelated pillars: speed and agility, AI-powered automation, improved visibility through testing and observability, and embedded security and governance. The framing is useful, but almost everyone treats the pillars as co-equal line items, four things to fund in roughly equal measure. That symmetry is the mistake. Three of the four pillars do the same job: they multiply how fast code reaches production. Only one increases the rate at which you can verify that the code is safe to keep.

Count the inflows. Generative AI writes functions, tests, and documentation. McKinsey found it roughly halved the time on code documentation and meaningfully cut time on code generation and refactoring (McKinsey, June 2023). Low-code platforms let citizen developers ship working apps in days instead of sprints. DevOps automation removes the manual approval steps that used to slow releases: in the CD Foundation’s 2024 survey, 83% of developers were involved in DevOps activity, and broader use of those technologies tracked with faster change lead time and faster service restoration (CD Foundation, 2024). Three pillars, one direction. More code, faster, from more people.

Now count what scales review. The fourth pillar, visibility plus governance, is the only one whose job is to absorb that flood: to test it, observe it in production, prove it complies, and catch the change that should not have shipped. Fund governance as a fourth co-equal thing and it competes with speed for the same budget. Under deadline pressure, speed wins every quarter. Build it as the floor the other three stand on, and it becomes the thing that makes speed survivable.

The asymmetry is structural, not rhetorical

The asymmetry follows from arithmetic, before any judgment about how much to value safety. If three of your four investments raise the volume of code entering the system and only one raises your capacity to review what entered, then at parity funding your review capacity falls behind by construction. The inflows compound. A team that doubles its generation throughput has doubled the load on the same review pipeline unless that pipeline scaled too.
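A minimal sketch of that arithmetic, under loudly stated assumptions: the function name, the per-period growth factors, and the starting numbers are all hypothetical, chosen only to illustrate the shape of the problem. It is a toy model, not a measurement of any real team.

```python
def unreviewed_backlog(inflow, review_capacity, inflow_growth, review_growth, periods):
    """Toy model (hypothetical): track changes that entered but were never verified.

    inflow          -- changes produced in the first period
    review_capacity -- changes the verification layer can absorb in the first period
    inflow_growth   -- multiplier applied to inflow after each period
    review_growth   -- multiplier applied to review capacity after each period
    Returns the backlog of unverified changes at the end of each period.
    """
    backlog = 0.0
    history = []
    for _ in range(periods):
        # Whatever review cannot absorb this period carries over to the next.
        backlog = max(0.0, backlog + inflow - review_capacity)
        history.append(backlog)
        inflow *= inflow_growth
        review_capacity *= review_growth
    return history


# Illustrative numbers only: generation doubles each period, review stays flat.
flat_review = unreviewed_backlog(100, 100, 2.0, 1.0, 3)

# Same inflow, but the review pipeline scales with it.
scaled_review = unreviewed_backlog(100, 100, 2.0, 2.0, 3)
```

In this toy setup the flat-review backlog grows every period, while the scaled-review backlog stays at zero. The point is the direction, not the magnitudes.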

The classic developer productivity story hid this because it measured the wrong axis. We have written before about how time-to-merge flatters the generation side and ignores the cost it pushes downstream (see why McKinsey measured the wrong thing). The four-pillar framing makes the same omission structural. Three pillars are throughput pillars. One is verification. Call them co-equal and you will quietly underfund the only one that does not make the problem worse.

What “verification capacity” actually means

Verification capacity reaches well past a headcount of reviewers. It is the full set of mechanisms that let you trust output without reading every line of it. Automated tests that fail loudly. Observability that surfaces a regression in production before a customer files a ticket. Policy checks that block a non-compliant change at the gate. Lineage that tells you which agent or which person produced a given diff, so a bad pattern can be traced and stopped at the source.

When the McKinsey satisfaction numbers came out, the headline was that 87% of developers using generative AI could focus on meaningful work, against 50% of those not using it (McKinsey, June 2023). Real gain, worth having. The quiet condition underneath it: that focus holds only while someone, or something, is still catching the output that should not ship. Strip the verification layer and the same throughput that freed the developer becomes the thing that buries the reviewer. The satisfaction number is a downstream effect of the floor holding.

The review wall your velocity chart cannot see

Velocity charts measure inflow. Stories closed, PRs merged, deploys per day. They go up and to the right exactly as you would hope while you fund the three throughput pillars. The chart shows nothing about whether your capacity to verify kept pace, because verification failure does not show up as slower velocity. It shows up later, as incidents, as a compliance finding, as a production regression nobody caught because the observability for that path was never built, as the slow accumulation of code that works today and nobody understands tomorrow.

That is the trap of treating governance as the fourth co-equal pillar. The cost of underfunding it is invisible on the dashboard that the other three pillars light up. By the time the cost is visible, it has already arrived as an incident, after the budget window closed. The teams that hit this wall did not see it coming, because the instrument they trusted only measured the half of the system that was working.

Build the floor first

The fix lives in the ordering of decisions, before any budget increase. Before you scale any throughput pillar, ask what verification mechanism absorbs its output. Turning on an AI coding assistant for the team raises generation throughput now. The matching question is whether your test coverage, your production observability, and your policy gates can catch what that assistant will produce at volume. If they cannot, you have funded inflow without funding the floor, and you have started a countdown to the review wall.

Concretely, this maps to the measurement work we keep returning to. Instrument the pair, not just the model (measure the team, not the model). Close the loop between what agents produce and what governance observes (the observability and governance loop). Keep humans on the loop where verification still needs judgment (on the loop, not in the loop). Those are not separate initiatives bolted onto delivery. They are the floor, and the three throughput pillars are only safe to the height that floor can hold.

Do this now

Take your current AI-delivery roadmap and label every initiative as throughput or verification. Generation tooling, low-code rollout, CI/CD automation: throughput. Test infrastructure, production observability, policy gates, lineage: verification. Then total the investment on each side. If throughput outweighs verification by more than a little, you are funding the inflow and starving the floor, and your velocity chart will keep climbing right up until the day it cannot. Rebalance before the wall, while the fix is still a budget line and not an incident postmortem. The binding constraint in AI-era delivery has moved: out of how fast you can generate code, and into how fast you can trust it.
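The labeling exercise can be sketched in a few lines. Everything here is an assumption for illustration: the initiative names echo the categories above, the cost figures are invented placeholders in arbitrary units, and the helper names are hypothetical. Substitute your own roadmap and budget numbers.

```python
THROUGHPUT = "throughput"
VERIFICATION = "verification"


def total_by_side(initiatives):
    """Sum investment per side from (name, side, cost) tuples."""
    totals = {THROUGHPUT: 0.0, VERIFICATION: 0.0}
    for name, side, cost in initiatives:
        if side not in totals:
            raise ValueError(f"{name!r} must be labeled throughput or verification")
        totals[side] += cost
    return totals


def throughput_to_verification_ratio(totals):
    """How many units fund inflow for every unit that funds the floor."""
    if totals[VERIFICATION] == 0:
        return float("inf")  # nothing at all is funding verification
    return totals[THROUGHPUT] / totals[VERIFICATION]


# Hypothetical roadmap; costs are placeholders, not data from any source.
roadmap = [
    ("AI coding assistant", THROUGHPUT, 300),
    ("Low-code rollout", THROUGHPUT, 200),
    ("CI/CD automation", THROUGHPUT, 100),
    ("Test infrastructure", VERIFICATION, 100),
    ("Production observability", VERIFICATION, 60),
    ("Policy gates", VERIFICATION, 30),
    ("Lineage", VERIFICATION, 10),
]

totals = total_by_side(roadmap)
ratio = throughput_to_verification_ratio(totals)
```

With these placeholder figures the roadmap spends three units on throughput for every unit on verification. What ratio counts as starving the floor is a judgment for your own context; the sketch only makes the imbalance visible.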

This analysis synthesizes Unleashing developer productivity with generative AI (McKinsey & Company, June 2023) and State of CI/CD Report 2024 (CD Foundation, 2024).

Victorino Group helps teams build verification capacity into AI delivery before the review wall arrives. Let’s talk.