Software Slop Is an Attention Problem
The word “slop” entered the software vocabulary sometime in 2025. Most people use it to mean “low-quality AI-generated code.” That definition is imprecise, and the imprecision matters. Code quality is subjective, context-dependent, and nearly impossible to measure at scale.
Nuno Job, writing on pscanf.com in April 2026, proposes a better definition. Slop is not about the code. It is about the attention. Specifically: slop is the distance between the review effort a piece of code requires and the review effort it actually received.
That reframing turns a vague complaint into a measurable signal.
The Formula
Job’s metric is simple. Every code contribution has an “attention cost” (how much review it needs) and an “attention spent” (how much review it got). The ratio between them produces a score from 0 to 5. Zero means the code was thoroughly reviewed. Five means nobody looked at it.
The formula: Score = 5 x (1 - min(attention_spent / attention_cost, 1)).
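The formula can be sketched in code. This is a minimal reading of the equation as given; the function name, units, and the handling of a zero attention cost are assumptions, not part of the article:

```python
def slop_score(attention_spent: float, attention_cost: float) -> float:
    """Score from 0 (thoroughly reviewed) to 5 (nobody looked).

    attention_spent and attention_cost must be in the same units
    (e.g. reviewer-minutes); the units cancel in the ratio.
    """
    if attention_cost <= 0:
        return 0.0  # assumed: a change needing no review carries no deficit
    ratio = min(attention_spent / attention_cost, 1.0)
    return 5 * (1 - ratio)

# A change needing ~120 reviewer-minutes that received 10 (example numbers):
print(round(slop_score(10, 120), 1))  # → 4.6
```

The `min(..., 1)` clamp means extra review beyond the cost cannot push the score below zero: once a change has received the attention it requires, it is simply not slop.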
He built a tool called the Slop-O-Meter and ran it against public repositories. Facebook’s React scored 0.6. That tracks. React has one of the most rigorous review cultures in open source. OpenClaw, an AI-generated project, scored 3.8. SQLite, despite being famously well-tested, scored 3.0 because its commit patterns look like single-author work with limited external review.
Job is candid about limitations. The tool “utterly fails for many repos.” It uses proxy signals (commit patterns, review activity, contributor diversity) rather than measuring actual attention. But the framework underneath the tool is more important than the tool itself.
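The article does not publish the tool's internals, but the proxy approach it describes might be sketched like this. Every field name and weight below is invented for illustration; the real Slop-O-Meter almost certainly differs:

```python
from dataclasses import dataclass

@dataclass
class ReviewSignals:
    # Proxy signals of the kind the article mentions. The weights in
    # estimate_attention_spent are invented, not the tool's actual ones.
    review_minutes: float    # total time reviewers spent on the change
    review_passes: int       # how many review rounds the change received
    distinct_reviewers: int  # contributor diversity among reviewers

def estimate_attention_spent(s: ReviewSignals) -> float:
    """Rough attention-spent estimate in reviewer-minutes."""
    # Extra passes and extra reviewers each add a modest multiplier,
    # since fresh eyes and repeat reads catch different problems.
    pass_bonus = 1.0 + 0.1 * max(s.review_passes - 1, 0)
    diversity_bonus = 1.0 + 0.2 * max(s.distinct_reviewers - 1, 0)
    return s.review_minutes * pass_bonus * diversity_bonus
```

The weakness Job concedes is visible in the sketch: a repository whose review happens outside these signals, such as SQLite's single-author workflow with heavy external testing, will look under-reviewed no matter how the weights are tuned.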
Why This Matters Now
Before AI coding tools, slop was structurally difficult to produce. You typed code by hand. The act of typing forced a baseline level of attention. You read each line as you wrote it. You caught obvious errors in real time. You made small decisions constantly: this variable name or that one, this pattern or that one. The cognitive engagement was inseparable from the physical act of production.
AI removed that constraint entirely.
A developer can now generate 500 lines of code in ten seconds. The code compiles. It passes basic tests. It looks reasonable. But nobody has read it with the kind of attention that 500 lines of logic actually demands. The attention cost stayed the same. The attention spent dropped to near zero.
This is the mechanism behind the numbers we have been tracking. As we documented in The AI Verification Debt, 96% of developers distrust AI output, but only 48% consistently verify it. The Slop-O-Meter gives that verification deficit a score. It puts a number on the distance between “this code exists” and “someone took responsibility for this code.”
Attention as the Scarce Resource
The conventional framing of AI productivity focuses on output. More code, faster. More features shipped. More pull requests merged. Job’s framework inverts that lens. The scarce resource was never code. It was always attention.
This inversion explains findings that otherwise seem contradictory. The Verification Tax showed that executives report saving 4.6 hours per week with AI while workers spend 3.8 hours per week checking AI output. The net gain is 48 minutes. If you think of AI as a code production tool, those numbers make no sense. If you think of code production as an attention allocation problem, they make perfect sense. AI increased production without increasing the attention budget. The budget had to come from somewhere, so workers spend their saved time on verification.
Simon Willison’s observation, which we explored in Cheap Code, Expensive Quality, cuts the same direction. Code generation dropped to near-free. Code verification did not. Willison frames this as an economic problem. Job frames it as an attention problem. They are describing the same asymmetry from different angles.
What Slop Scores Actually Measure
Job makes a claim that deserves careful consideration: code stops being slop the moment someone “carefully goes over it, edits to quality standards, and verifies it works.” The origin of the code is irrelevant. AI-generated code that receives thorough human review is not slop. Human-written code that ships without review is.
This maps directly to the cognitive debt problem that Martin Fowler and Margaret-Anne Storey identified. As we explored in Cognitive Debt: The Invisible Cost of AI-Generated Code, the danger is not bad code but code that nobody understands. Slop scores quantify the same risk from a process perspective rather than a knowledge perspective. Low attention means low understanding. Low understanding means high cognitive debt.
The measurement also exposes a seniority problem. Senior engineers spend 4.3 minutes reviewing each AI suggestion. Junior engineers spend 1.2 minutes. A repository where seniors review AI output will score lower on the Slop-O-Meter than one where juniors do. Same code, same tools, radically different slop scores. The variable is not the AI. It is the reviewer.
From Metric to Policy
The practical question is whether attention deficits can be governed, not just measured.
Job’s framework suggests they can. If you can estimate the attention cost of a code change (based on complexity, risk, system criticality) and measure the attention spent (based on review time, reviewer seniority, number of review passes), you can set thresholds. A change to a payment processing module with a slop score above 2.0 gets flagged. A change to a test fixture with a score of 3.5 gets waved through. The threshold reflects the actual risk, not a blanket policy that treats all code equally.
This is what verification policies need. Not “all AI code must be reviewed” (which creates unsustainable overhead) and not “trust the AI” (which creates unacceptable risk). Instead: allocate review effort proportional to the attention cost of the change. Measure whether that allocation actually happened. Flag the deficit when it compounds.
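A minimal gate over such scores might look like the following. The thresholds, module names, and default tier are assumptions chosen to mirror the payment-module and test-fixture examples above, not values from the article:

```python
# Risk-tiered slop thresholds: hypothetical values for illustration.
THRESHOLDS = {
    "payments": 2.0,  # critical path: little attention deficit tolerated
    "core": 2.5,
    "tests": 3.5,     # low-risk fixtures can be waved through
}

def flag_change(module: str, score: float) -> bool:
    """Return True if the change should be routed back for more review."""
    limit = THRESHOLDS.get(module, 2.5)  # assumed default for unknown modules
    return score > limit

print(flag_change("payments", 2.4))  # 2.4 > 2.0 → True, flagged
print(flag_change("tests", 3.4))     # 3.4 < 3.5 → False, waved through
```

The point of the tiering is exactly the one made above: the same score that blocks a payment change sails through on a test fixture, because the threshold encodes risk rather than a blanket rule.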
The Slop-O-Meter is admittedly crude. But the principle underneath it is sound. Organizations that can measure their attention deficit can manage it. Organizations that cannot measure it will discover it in production.
The Governance Implication
Every governance framework needs metrics to function. You cannot enforce what you cannot measure. The AI code governance conversation has been stuck on proxy measures: percentage of code that is AI-generated, number of vulnerabilities detected, developer satisfaction scores. None of these capture the core risk, which is insufficient human oversight of machine-generated output.
Slop, defined as attention deficit, is the missing metric. It measures the thing that actually predicts failure: the distance between how much scrutiny code needed and how much it got. It is imperfect, approximate, and early-stage. It is also the first serious attempt to quantify what every senior engineer already knows intuitively. The danger is not AI writing code. The danger is nobody reading it.
This analysis builds on Can We Measure Software Slop? by Nuno Job (April 2026), with additional context from Sonar’s 2026 State of Code Developer Survey (January 2026), the Foxit AI Productivity Study (March 2026), and Simon Willison’s Agentic Engineering Patterns (February 2026).
Victorino Group helps engineering organizations build verification policies that match review effort to actual risk. Let’s talk.
All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com. About The Thinking Wire →