The AI Control Problem

Agentic Engineering: Why What You Call AI-Assisted Development Matters

Thiago Victorino
10 min read

Addy Osmani published a piece this week that deserves more than the usual tech-Twitter half-life. His argument: the distinction between “vibe coding” and “agentic engineering” isn’t semantic pedantry. It’s the difference between treating AI as entertainment and treating it as engineering infrastructure.

He’s right. But the implications run deeper than his article explores.

The Vocabulary Problem

When Andrej Karpathy coined “vibe coding” in early 2025, he was describing something real: the experience of prompting an AI, accepting its output, and moving on without deep review. A feeling-first approach to development.

The term caught on because it was fun. And that’s exactly the problem.

“Vibe coding” entered corporate vocabulary as a casual, almost playful label. It gave organizations permission to treat AI-assisted development as an experiment without structure. No review processes needed. No quality gates required. No accountability frameworks demanded. The name itself signaled: this isn’t serious enough to govern.

Osmani’s counter-term — agentic engineering — does the opposite. It implies architecture. Review. Ownership. Standards. The moment you call something “engineering,” you invoke an entire ecosystem of professional expectations: testing, documentation, code review, production monitoring.

This is not about being pedantic with labels. The vocabulary an organization adopts for AI-assisted development determines what governance structures get built around that work.

Names Create Infrastructure

Consider what happens in practice when a team says “we do vibe coding”:

  • No formal review process for AI-generated code
  • No distinction between prototype and production
  • No accountability for output quality
  • No skill development requirements
  • No measurement framework

Now consider what happens when a team says “we practice agentic engineering”:

  • Code review applies to AI output as it does to human output
  • Architects define boundaries before agents execute
  • Tests are written before delegation, not after
  • Engineers own the codebase regardless of who — or what — wrote it
  • Quality metrics exist and are tracked

Same tools. Same models. Same capabilities. Radically different organizational outcomes. The name didn’t change the technology. It changed the human systems around the technology.
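The "tests are written before delegation" practice is worth making concrete. Here is a minimal sketch, assuming a hypothetical `slugify()` utility invented purely for illustration (it does not come from Osmani's article): the engineer writes the acceptance contract first, the agent produces the implementation, and the contract gates acceptance.

```python
# A minimal sketch of "tests before delegation". slugify() and its
# contract are hypothetical, invented for this illustration.
import re

# Step 1: the engineer writes the contract BEFORE any code is generated.
# These assertions define "done" for the agent.
def spec_slugify(fn) -> None:
    assert fn("Agentic Engineering") == "agentic-engineering"  # lowercase, hyphenate
    assert fn("Vibe coding?!") == "vibe-coding"                # strip punctuation
    assert fn("  a   b  ") == "a-b"                            # collapse whitespace

# Step 2: the implementation is delegated to the agent, then reviewed
# like any human-written code.
def slugify(title: str) -> str:
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

# Step 3: the spec gates acceptance. A failing assert means the agent's
# output is rejected and re-worked, not shipped.
spec_slugify(slugify)
```

The order is the point: in a vibe-coding workflow the checks (if any) come after the output is accepted; in an agentic-engineering workflow the contract exists before the agent runs.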

When Vibe Coding Is Exactly Right

Osmani is careful — and correct — to note that vibe coding has legitimate uses. Hackathons. Personal scripts. MVPs where speed outweighs durability. Learning exercises where the goal is understanding, not production quality.

Simon Willison put it precisely in March 2025: if you wrote code with an LLM, reviewed it, tested it, and can explain it — that’s software development. The distinction isn’t about the tool. It’s about the discipline applied after the tool produces output.

The problem is not that vibe coding exists. The problem is when organizations use vibe-coding practices at production scale and call it engineering. That’s not a tool problem. It’s a governance failure.

The Seniority Amplifier

Osmani raises a point that deserves more attention: agentic engineering disproportionately benefits senior engineers. Those with deep architectural knowledge, pattern recognition, and judgment about trade-offs get more from AI agents than those without.

This isn’t surprising, but the mechanism matters. Senior engineers write better specs. Better specs produce better AI output. Better output requires less correction. Less correction means faster iteration. The advantage compounds.

For junior engineers, the risk is what Osmani calls “skill atrophy” — the gradual erosion of fundamental capabilities when AI handles the implementation you’d otherwise learn from. A junior developer who never debugs a race condition manually may never develop the intuition to recognize one in AI-generated code.

The organizational implication: agentic engineering isn’t just a workflow change. It’s a talent strategy. Companies that adopt it need deliberate plans for developing junior engineers who can eventually become the senior architects directing the agents.

The Spec-Quality Feedback Loop

One of Osmani’s most important observations is that solid engineering practices improve AI output quality. Better specs, comprehensive tests, and clean architecture don’t just help human developers — they make agents more effective.

This creates a virtuous cycle that organizations mostly ignore: investing in specifications, documentation, and test infrastructure pays double dividends. Once for human comprehension. Once for agent performance.

We wrote about this principle in detail in our guide on writing specs for AI agents. The organizations that treat spec-writing as overhead will get mediocre AI output. The organizations that treat specs as their primary engineering artifact will get disproportionate returns.

The METR study from 2025 found that developers using AI were 19% slower despite feeling 20% faster. One likely explanation: they were vibe coding. Prompting without specs, accepting without review, debugging without tests. The feeling of speed without the substance of it.

The Governance Gap, Again

This connects to the pattern we’ve been tracking since the beginning of this year: the distance between AI capability and organizational readiness.

The tools work. Claude, Codex, Cursor — they produce functional code at a speed that would have been inconceivable three years ago. The models aren’t the bottleneck. The organizations are.

What’s missing isn’t technical capability. It’s the governance layer: who reviews AI output, what standards apply, how quality is measured, what happens when something fails in production. These are organizational decisions, not technical ones.

The vocabulary an organization uses reveals where it sits on this readiness spectrum. “We’re experimenting with vibe coding” means: we have no governance yet. “We practice agentic engineering” means: we’ve built the human systems to match the machine capabilities.

Neither is wrong in absolute terms. But only one scales to production.

What This Means For Your Organization

Stop treating the name as trivial. The label your team uses for AI-assisted development shapes how leadership budgets for it, how HR writes job descriptions around it, how legal evaluates liability for it, and how quality teams audit it. Pick the name deliberately.

Audit your current practice honestly. If your developers prompt AI, accept output, and ship — you’re vibe coding, regardless of what you call it. That’s fine for prototypes. It’s negligent for production systems your customers depend on.

Build the governance before scaling the tools. Osmani’s workflow — plan first, direct and review, test relentlessly, own the codebase — isn’t just good engineering advice. It’s the minimum viable governance framework for AI-assisted development. Without it, more AI capability just means more ungoverned code in production.

Invest in specs, not just tools. The spec is the new bottleneck. The quality of what you tell the agent determines the quality of what the agent produces. Organizations that underinvest in specification and overinvest in tooling will get exactly the results they deserve.

Osmani closes with an observation worth internalizing: AI didn’t cause the quality problems. Skipping the design thinking did. The discipline was always the point. AI just made the consequences of skipping it more visible — and more expensive.


Sources

  • Addy Osmani. “Agentic Engineering.” addyosmani.com, February 2026.
  • Addy Osmani. Beyond Vibe Coding. O’Reilly Media, September 2025.
  • Simon Willison. On code review and AI-assisted development. simonwillison.net, March 2025.
  • Andrej Karpathy. Original “vibe coding” observation. X/Twitter, 2025.
  • METR. “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.” metr.org, July 2025.
  • Kent Beck. “Programming Deflation.” tidyfirst.substack.com, September 2025.

At Victorino Group, we help organizations build the governance layer that turns AI capability into reliable engineering outcomes. If you’re moving from experimentation to production, let’s talk.
