The AI Control Problem

The Trust Gap Is the Governance Gap

Thiago Victorino

Stack Overflow’s 2025 Developer Survey asked 49,000 developers about AI. The headline finding: 84% now use or plan to use AI coding tools, up from 76% the year before.

Buried further down: trust in AI accuracy dropped from roughly 40% to 33%. And 46% of developers actively distrust AI-generated output.

Read those numbers together. More people are using AI. Fewer people believe it works correctly. That is not a contradiction. It is a signal.

The Paradox That Isn’t

The instinct is to treat declining trust as a problem to solve with better models, better training data, or better knowledge management. Stack Overflow’s CEO, Prashanth Chandrasekar, frames it exactly this way. In a January 2026 McKinsey interview, he argues that “knowledge quality is THE foundation of trust” and that organizations need better knowledge tools to restore confidence.

This framing is convenient. Stack Overflow sells knowledge tools. The company’s traffic has dropped roughly 50% since ChatGPT launched in late 2022. Its new product, Stack Overflow Internal, is a knowledge management platform for enterprises. When the CEO says the solution is better knowledge infrastructure, he is describing his product roadmap.

Set aside the sales pitch. Look at what the data actually shows.

Sonar’s 2026 State of Code report surveyed 1,149 developers and found that AI-generated code now accounts for 42% of committed code. Yet 96% of those developers said they do not fully trust AI output. Only 48% always verify what AI produces before committing it.

Let that sink in. Nearly half the codebase is machine-generated. Fewer than half the developers verify it. And almost nobody fully trusts it.

This is not a knowledge problem. It is an organizational controls problem.

Verification Debt Is the Real Metric

LinearB’s 2026 Engineering Benchmarks analyzed 8.1 million pull requests across 4,800 engineering teams. One number tells the story: AI-generated pull requests have a 32.7% acceptance rate. Human-written pull requests: 84.4%.

Two out of three AI pull requests get rejected. That is not a tooling failure. That is a quality signal. The code is being produced faster than it can be verified, and reviewers are catching the difference.

When organizations track this metric, they discover something useful: the trust deficit has a number. It is measurable. It is 32.7% versus 84.4%. You can track it over time. You can set targets. You can hold teams accountable for improving it.

Most organizations do not track it. They measure adoption rates and lines of code generated, which are input metrics. They celebrate that developers are using AI more. They do not measure whether the output is any good.

This is like measuring a factory’s productivity by how many parts enter the assembly line, not how many finished products pass quality inspection.
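The acceptance-rate gap is computable from basic pull request records. A minimal sketch, assuming each record carries an origin label and a merged flag (both field names are invented for illustration, not any particular tool's API):

```python
# Sketch: compute acceptance rates by PR origin ("ai" vs. "human").
# The record shape {"origin": ..., "merged": ...} is an assumption.

def acceptance_rates(prs):
    """Return {origin: accepted/total} for a list of PR records."""
    totals, accepted = {}, {}
    for pr in prs:
        origin = pr["origin"]
        totals[origin] = totals.get(origin, 0) + 1
        if pr["merged"]:
            accepted[origin] = accepted.get(origin, 0) + 1
    return {o: accepted.get(o, 0) / totals[o] for o in totals}

prs = [
    {"origin": "ai", "merged": True},
    {"origin": "ai", "merged": False},
    {"origin": "ai", "merged": False},
    {"origin": "human", "merged": True},
    {"origin": "human", "merged": True},
]
rates = acceptance_rates(prs)
# On this toy data: rates["ai"] is one in three, rates["human"] is 1.0
```

Tracked weekly, a number like this turns "we don't trust the output" into a trend line an engineering leader can act on.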

The “Almost Right” Problem

Stack Overflow’s survey found that 66% of developers spend more time fixing “almost-right” AI code than they would have spent writing it from scratch. This maps to what METR, a research organization, found in a rigorous randomized controlled trial: experienced developers were 19% slower with AI assistance on real tasks. They believed they were 24% faster.

The perception mismatch matters. If developers think AI is saving them time while it is actually costing them time, no amount of individual discipline will fix the problem. You need organizational measurement.

The “almost right” phenomenon is particularly corrosive because it is invisible to management. The code compiles. The tests pass (if tests exist for that code path). The pull request looks reasonable. The bugs are subtle: edge cases missed, error handling incomplete, security assumptions wrong. They surface weeks or months later, in production, where the cost of fixing them is orders of magnitude higher.
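To make the failure mode concrete, here is an invented example of the kind of defect described above: a pagination helper that compiles, passes a round-number happy-path test, and still silently drops the final partial page.

```python
# Illustrative only: an "almost right" helper of the kind described above.
# Floor division looks plausible and works on round inputs, but any
# trailing partial page is silently dropped.

def page_count_buggy(total_items, page_size):
    return total_items // page_size  # misses the last partial page

def page_count_fixed(total_items, page_size):
    return (total_items + page_size - 1) // page_size  # ceiling division

# page_count_buggy(100, 10) -> 10  (happy path: looks correct)
# page_count_buggy(101, 10) -> 10  (one page of items lost)
# page_count_fixed(101, 10) -> 11
```

A reviewer skimming a large diff will not catch this; a test suite that only exercises round numbers will not either. That is why it surfaces in production.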

Veracode’s 2025 analysis found that 40-48% of AI-generated code contains security vulnerabilities, across over 100 LLMs tested. That is not a number that improves with better prompting. It is a number that requires verification infrastructure.

Why the Pilot Works and the Rollout Fails

Chandrasekar, in the same McKinsey interview, describes what he calls the pilot-to-production collapse. Pioneer teams show 20-30% productivity gains from AI tools. Organization-wide rollouts drop to single-digit improvements. McKinsey’s own January 2026 data confirms this pattern.

The standard explanation is that pioneer teams are more skilled, more motivated, more carefully selected. True, but incomplete. Pioneer teams also operate with tighter feedback loops, more direct oversight, and higher verification standards. They are, in effect, running a governed experiment.

When the tool rolls out to the broader organization, the governance does not scale with it. The adoption scales. The verification does not. The measurement does not. The quality gates do not. You get more AI-generated code with less oversight per line.

Sonar’s finding that 35% of developers access AI through personal accounts is the clearest indicator. One in three developers is using AI tools that the organization cannot monitor, audit, or govern. This is shadow AI operating at scale, producing code that enters production without organizational awareness.

Trust Is Not a Feeling. It Is Infrastructure.

Here is the reframe that matters: declining developer trust is not a crisis. It might be the healthiest thing happening in software engineering right now.

Developers are right to distrust AI output. The acceptance rates prove it. The security analysis proves it. The controlled experiments prove it. Skepticism is the correct response to a tool that produces plausible-looking output with a 40-48% vulnerability rate.

The problem is not that developers distrust AI. The problem is that organizations have no infrastructure to convert that healthy skepticism into systematic quality control.

When a developer says “I don’t trust AI output,” what they mean is: I have no reliable way to verify it at the pace I am expected to produce it. There is no organizational standard for what “verified” means. There is no escalation path when I find problems. There is no one measuring whether verification is happening.

That is a governance vacuum, not a trust crisis.

What Governance Infrastructure Looks Like

Four things distinguish organizations that are closing this deficit from those accumulating it.

Measurable trust metrics. Not surveys. Not vibes. Actual acceptance rates for AI-generated code, defect rates compared to human-written code, verification ratios, and time-to-detection for AI-introduced bugs. LinearB’s 32.7% vs. 84.4% is the kind of metric that should be on every engineering leader’s dashboard.

Operating model changes. Who reviews AI-generated code? Is it the same process as human code review? What percentage of AI output requires verification? What is the escalation path when a reviewer finds a pattern of AI-generated defects? These are not technical questions. They are organizational design questions.

Automated verification gates. Developer discipline does not scale. When 42% of code is AI-generated and only 48% of developers verify it, you cannot rely on individual behavior. You need automated quality gates that flag AI-generated code for additional scrutiny: static analysis tuned for common AI failure patterns, security scanning calibrated to the vulnerability profiles that Veracode documented, and integration tests that cover the edge cases AI consistently misses.
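A gate of this kind can start small. A minimal sketch, assuming AI-assisted commits are tagged with a commit-message trailer (the trailer name and check names here are invented conventions, not a standard):

```python
# Sketch of an automated verification gate for a CI pipeline.
# Assumption: AI-assisted commits carry an "AI-Generated: true"
# trailer, an invented convention this sketch relies on.

import re

AI_TRAILER = re.compile(r"^AI-Generated:\s*true$",
                        re.MULTILINE | re.IGNORECASE)

def required_checks(commit_message):
    """Extra CI checks to require before an AI-assisted commit merges."""
    if AI_TRAILER.search(commit_message):
        # Route AI-generated code into the stricter lane.
        return ["static-analysis-ai-profile", "security-scan",
                "second-reviewer"]
    return []  # human-written commit: normal review path

ai_checks = required_checks("Fix parser\n\nAI-Generated: true")
human_checks = required_checks("Fix parser")
# ai_checks lists three extra gates; human_checks is empty
```

The point is not this particular mechanism but the principle: verification is triggered by the code's provenance, not by individual reviewer discipline.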

Governance frameworks that scale with adoption. The pilot team had ten developers and tight oversight. The org-wide rollout has two thousand developers and the same oversight budget. Governance infrastructure must scale at the same rate as adoption, or verification debt compounds silently.

The Forecast Is Clear

Gartner named agentic AI oversight the number one cybersecurity trend for 2026. Forrester predicts an agentic AI breach this year, specifically from governance-less deployment. These are not fringe predictions. They are mainstream analyst consensus.

The pattern is familiar. Organizations adopt a powerful technology. They measure adoption. They celebrate growth. They defer governance. Something breaks. Then governance becomes urgent.

We have seen this cycle with cloud migration, with mobile, with microservices. Every time, the organizations that built governance infrastructure before the breach outperformed those that built it after. The cost difference is not incremental. It is structural.

AI-generated code at 42% of committed code and rising is not a future scenario. It is the present reality. The question is whether your organization has the infrastructure to govern it, or whether you are accumulating verification debt and hoping that developer skepticism is a sufficient control.

It is not. Skepticism without infrastructure is just worry.


Sources

  • Stack Overflow. “2025 Developer Survey.” 49,000+ respondents. Trust in AI accuracy: 33%, down from ~40% prior year. 84% use or plan to use AI tools.
  • Sonar. “State of Code 2026.” 1,149 developers. 42% of committed code AI-generated. 96% do not fully trust AI output. 48% always verify. 35% use personal accounts.
  • LinearB. “2026 Engineering Benchmarks.” 8.1M pull requests, 4,800 teams. AI PR acceptance rate: 32.7% vs. manual: 84.4%.
  • METR. 2025 randomized controlled trial. Experienced developers 19% slower with AI assistance. Perceived 24% faster.
  • Veracode. 2025 analysis. 40-48% of AI-generated code contains security vulnerabilities across 100+ LLMs.
  • McKinsey. January 2026 interview with Prashanth Chandrasekar (Stack Overflow CEO). Pilot-to-production productivity collapse documented.
  • Gartner. “Top Cybersecurity Trends 2026.” Agentic AI oversight ranked #1.
  • Forrester. 2026 prediction: agentic AI breach from governance-less deployment.
  • Kiteworks. 2026 report. 63% of organizations cannot enforce AI purpose limits. 60% cannot terminate misbehaving agents.

Victorino Group helps organizations build governance infrastructure for AI-generated code. If your team is producing faster than it can verify, that is the problem we solve. Reach out at contact@victorinollc.com or visit www.victorinollc.com.
