The Scientific Loop Has Four Roles. AI Only Gets One of Them.

On May 25, Alejandro Piad Morffis published a short essay called “AI is doing something weird to Science.” It does what most takes on AI and science do not: it refuses the binary of “AI is replacing scientists” versus “AI is just a tool.” Instead, it decomposes what scientists actually do into four roles and asks which ones survive contact with a large language model.

The four roles are: poser, proposer, verifier, curator. Read them slowly. Most discussions of AI in knowledge work collapse all four into a single thing called “the human” or “the expert.” Piad pulls them apart, and once they are apart, the load-bearing role becomes obvious. It is not the one most people assume.

This matters far outside science. The same four roles are present in legal review, financial analysis, marketing production, and any other knowledge workflow where AI is now generating candidates. If your organization cannot point to the verifier, you do not have governance. You have decoration.

What Piad actually proposed

The four roles, in Piad’s structure and, where quoted, his own words:

Poser. Decides what is worth solving. Names the question. Sets the frame. In Piad’s account, this remains exclusively human. Not because LLMs cannot generate questions, but because the choice of which question matters is an act of taste, judgment, and stakes that no model can hold.

Proposer. Generates candidate solutions, fast. This is where the LLM lives. Piad is precise about the title: “Not discoverer, not author, not scientist. The one that generates candidates fast enough that the verifier can find something in the haystack.” The proposer’s job is volume and variety, not correctness.

Verifier. Checks whether a candidate is actually true. In Piad’s four documented cases, the verifier is never another LLM. It is formal logic (Lean), a combinatorial proof checker, a wet-lab experiment, or a crystallography measurement. The verifier cannot be fooled by plausible-sounding falsehoods, and plausible-sounding falsehoods are exactly what LLMs excel at producing.

Curator. Decides which surviving candidates are worth pursuing further. This is human again. The verifier tells you something is true; the curator tells you it is interesting, fits a research program, advances the field. Truth is necessary but not sufficient.

Piad’s punchline is direct: “The verifier is the one that matters. A loop with a weak proposer and a strong verifier still produces valid science, it is just slow.” Reverse the sentence and the implication is brutal. A loop with a strong proposer and a weak verifier produces fast nonsense at scale.
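The asymmetry in that punchline can be made concrete with a toy loop. What follows is a minimal sketch, not Piad’s code: the factoring problem, the function names, and the random-guess proposer are all assumptions chosen for illustration. The proposer is deliberately weak (random guesses), the verifier is a deterministic check, and the curator is a stand-in for human selection.

```python
import random
from typing import Callable, Iterable, List


def run_loop(
    propose: Callable[[], Iterable[int]],
    verify: Callable[[int], bool],
    curate: Callable[[List[int]], List[int]],
) -> List[int]:
    """Proposer generates candidates, verifier filters them, curator selects."""
    survivors = [c for c in propose() if verify(c)]
    return curate(survivors)


# Poser (human): the question worth asking. Here, a toy one:
# "which numbers between 2 and N - 1 divide N?"
N = 91


def weak_proposer() -> Iterable[int]:
    # Stand-in for the LLM: fast, cheap, mostly wrong guesses.
    rng = random.Random(0)
    return (rng.randint(2, N - 1) for _ in range(2000))


def strong_verifier(candidate: int) -> bool:
    # Deterministic check. Plausibility plays no part in the answer.
    return N % candidate == 0


def curator(survivors: List[int]) -> List[int]:
    # Stand-in for human judgment: keep distinct results, in order.
    return sorted(set(survivors))


# Weak proposer, strong verifier: slow, but everything that survives is true.
result = run_loop(weak_proposer, strong_verifier, curator)

# Same proposer, no real verifier: every guess "survives".
unchecked = run_loop(weak_proposer, lambda c: True, curator)

print("verified:", result)
print("unchecked candidates passed to the curator:", len(unchecked))
```

Whatever the proposer guesses, every item in the verified result divides N. Swapping in the accept-everything verifier changes nothing about the proposer, yet the curator now receives a pile of mostly false candidates.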

The cases are not new. The naming is.

Piad walks through four examples. Claude’s Cycles work in combinatorics, where Claude proposed candidate constructions and a formal checker verified them. Terence Tao’s Lean-assisted mathematics, where Tao directs the question and curates the result while Lean does the verification. AlphaFold, where the model proposes protein structures and crystallography verifies them. GNoME, where the model proposes candidate materials and physical synthesis verifies them.

He also reaches back to 1976. The Appel-Haken proof of the four-color theorem used the same loop structure: a human posed the question, a program generated candidate configurations, a verifier checked each one, and humans curated the surviving result into a proof. We have been running this loop for fifty years. We just never named the roles.

This is the move that makes the essay useful. Piad did not discover a new architecture. He gave a name to a pattern that was already running, and once the pattern is named, you can test for it.

The test, exported

Take the four roles to any AI deployment outside science and ask who fills each one:

Legal review. A firm deploys an LLM to summarize contracts and flag risks. Who is the poser? (The partner who decides which clauses matter.) Who is the proposer? (The model.) Who is the verifier? (Here it gets uncomfortable. Often the answer is “another associate reading the summary,” which is just a slower proposer. A real verifier would be a clause-level rule engine, a citation checker against case law, a structured diff against a known-good template.) Who is the curator? (The partner again, deciding which flagged risks deserve client conversation.)

Most legal AI deployments today have a poser, a proposer, a curator, and no verifier. The associate is performing verification theater. The model produces plausible-sounding falsehoods. The associate, under time pressure, reads them as competent summaries. The curator inherits unverified material as if it were verified.
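One of the verifiers named above, a structured diff against a known-good template, is small enough to sketch. The example below is illustrative only and assumes contracts have already been split into named clauses; the clause names and template text are hypothetical. It uses Python’s standard difflib module.

```python
import difflib
from typing import Dict, List

# Hypothetical known-good template, keyed by clause name.
TEMPLATE: Dict[str, str] = {
    "governing_law": "This Agreement is governed by the laws of New York.",
    "liability_cap": "Liability is limited to fees paid in the prior 12 months.",
}


def diff_against_template(contract: Dict[str, str]) -> Dict[str, List[str]]:
    """Return, per clause, the diff lines where the contract departs from the template."""
    findings: Dict[str, List[str]] = {}
    for name, expected in TEMPLATE.items():
        actual = contract.get(name)
        if actual is None:
            findings[name] = ["MISSING CLAUSE"]
        elif actual != expected:
            delta = difflib.unified_diff(
                [expected], [actual],
                fromfile="template", tofile="contract", lineterm="",
            )
            findings[name] = list(delta)
    return findings


# Hypothetical contract under review: one clause silently changed.
contract = {
    "governing_law": "This Agreement is governed by the laws of New York.",
    "liability_cap": "Liability is unlimited.",
}

for clause, lines in diff_against_template(contract).items():
    print(clause)
    for line in lines:
        print("  " + line)
```

The check does not read the model’s summary at all. It compares the contract text with the template directly, so a fluent but wrong summary cannot influence the result. It only covers deviations from clauses the template already knows about, which is why the partner still curates.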

Financial analysis. Same exercise. Who poses the question? (The CFO.) Who proposes the analysis? (The model running over the data.) Who verifies? (A reconciliation engine, a deterministic formula check, a cross-reference against the source ledger. Not another LLM “double-checking” the first.) Who curates? (The CFO, again.)

When the verifier is missing, finance teams end up with elegant narratives that footnote nothing checkable. The pattern Piad warns about in science shows up identically in the boardroom.
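A cross-reference against the source ledger can be equally plain. The sketch below assumes the model’s analysis has been reduced to a set of claimed account totals; the account names, figures, and the reconcile function are hypothetical. Decimal is used so that currency sums are exact.

```python
from decimal import Decimal
from typing import Dict, List


def reconcile(
    claims: Dict[str, Decimal], ledger: Dict[str, List[Decimal]]
) -> Dict[str, str]:
    """Check each claimed total against the sum of the ledger entries."""
    report: Dict[str, str] = {}
    for account, claimed in claims.items():
        if account not in ledger:
            # A claim with nothing behind it is flagged, not trusted.
            report[account] = "unverifiable: no ledger entries"
            continue
        actual = sum(ledger[account], Decimal("0"))
        if actual == claimed:
            report[account] = "ok"
        else:
            report[account] = f"mismatch: ledger says {actual}"
    return report


# Hypothetical source ledger (the system of record).
ledger = {
    "revenue": [Decimal("1200.50"), Decimal("799.50")],
    "travel": [Decimal("310.00")],
}

# Hypothetical totals extracted from the model's narrative.
claims = {
    "revenue": Decimal("2000.00"),
    "travel": Decimal("301.00"),     # transposed digits
    "consulting": Decimal("50.00"),  # appears nowhere in the ledger
}

for account, status in reconcile(claims, ledger).items():
    print(account, "->", status)
```

The design choice that matters is the third branch: a figure that cannot be traced to the ledger is reported as unverifiable rather than passed along, which is the checkable footnote the narrative otherwise lacks.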

Marketing production. A team uses AI to produce a hundred ad variants. Poser: brand strategist. Proposer: the model. Verifier: … brand guidelines compliance check? Legal review? A/B test against actual user behavior? Most teams skip straight from proposer to curator and call the creative director’s eyeball the verifier. The creative director cannot scale to a hundred variants, so the verification quietly does not happen.

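In all three cases, the failure mode is the same: nothing independent of the proposer checks its output. Either an LLM is doing both proposing and verifying, or a human is asked to verify by reading the proposer’s plausible prose. Piad’s framework names why this cannot work. The proposer optimizes for plausibility. The verifier must optimize for truth. You cannot do both with the same instrument.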

Why “human in the loop” is the wrong abstraction

Most AI governance frameworks demand a “human in the loop.” Piad’s decomposition exposes the imprecision. Which human? Doing which job? At which stage?

A human acting as curator after the verifier has done its work is governance. A human acting as poser before the proposer runs is governance. A human acting as verifier on the output of an LLM proposer, without formal checking infrastructure behind them, is performance art. They are being asked to do, by reading, what a non-LLM system needs to do by construction.

This is why so many “human review” deployments degrade. The reviewers are honest. They are also human, tired, and reading plausible prose. They cannot verify what the system has not made verifiable.

What to do this week

Three actions, ordered by leverage:

1. Run the four-role test on your most-deployed AI workflow. Write the four names. Assign each to a person or system. If the verifier slot is “the human reviewing the output,” you have no verifier.
2. Name what would have to be true for a real verifier to exist. It is rarely another AI. It is usually a rule engine, a formal checker, a deterministic system of record, or a test environment. Often it does not exist yet. That is the work.
3. Stop calling reviewers “verifiers.” Reviewers are curators. They decide what merits attention. They are not equipped to catch plausible falsehoods at scale. Honest naming alone changes how leaders allocate budget.
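The four-role test in the first action fits in a few lines if it helps to write it down as a checklist. This is a rough sketch under stated assumptions: the Role and Workflow types and the three kind labels ("human", "llm", "checker") are invented here for illustration, with "checker" standing for any rule engine, formal checker, system of record, or test environment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Role:
    name: str  # the person or system filling the slot
    kind: str  # hypothetical labels: "human", "llm", or "checker"


@dataclass
class Workflow:
    poser: Role
    proposer: Role
    verifier: Optional[Role]
    curator: Role


def four_role_test(w: Workflow) -> List[str]:
    """Return the governance gaps found in a workflow's role assignments."""
    findings: List[str] = []
    if w.poser.kind != "human":
        findings.append("poser is not a human")
    if w.curator.kind != "human":
        findings.append("curator is not a human")
    if w.verifier is None:
        findings.append("no verifier assigned")
    elif w.verifier.kind != "checker":
        findings.append(
            f"verifier is a {w.verifier.kind} reading output, not an independent check"
        )
    return findings


# The legal review example, as it is commonly deployed.
legal = Workflow(
    poser=Role("partner", "human"),
    proposer=Role("contract-summary model", "llm"),
    verifier=Role("associate", "human"),
    curator=Role("partner", "human"),
)

print(four_role_test(legal))
```

An empty list means all four slots are filled in the way the framework asks. Any entry is a named gap to budget for.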

Piad gave us a tool. The tool is small enough to use on a Monday and sharp enough to expose where governance ends and theater begins.

This analysis synthesizes “AI is doing something weird to Science” by Alejandro Piad Morffis (May 2026).

Victorino Group helps leadership teams export Piad’s four-role test into legal, financial, and marketing workflows, naming the independent verifier that turns “human in the loop” from posture into structure.