What Changed in the Opus 4.7 System Prompt Is a Governance Reference

Thiago Victorino

On April 18, Simon Willison published a side-by-side diff of the system prompts Anthropic ships with Claude Opus 4.6 and 4.7. It is a short post. It is also the most useful governance document released this month, and not for the reasons most readers will take from it.

The obvious reading is that Willison found the new prompt. The less obvious reading is that the diff exists at all, is legible to practitioners, and can be cited. That last part is what makes it a reference.

What actually changed

Five verified differences between 4.6 and 4.7, straight from the diff:

  1. Tool count: 23. Every one of the 4.7 prompt's 23 tool definitions is a declared surface where the model is allowed to act.
  2. Knowledge cutoff moved to January 2026. A small line with large implications for how the model narrates its own temporal context.
  3. Expanded child-safety block. The prompt now includes a stickier rule: “Once Claude refuses a request for reasons of child safety, all subsequent requests must be approached with extreme caution.” That is persistent state encoded in prose.
  4. Reduced verbosity. Explicit instruction to be shorter. A tone constraint written as a sentence, not tuned through a decoding parameter.
  5. Anti-“genuinely” language removed. The previous prompt had a direct instruction not to overuse the word “genuinely” because it had become an over-represented tic in outputs. The new prompt drops it, which suggests either the training handled it or the team decided the prompt was the wrong place for that fix.

Those five items are worth sitting with before reaching for a broader point. They are not style. They are five different governance mechanisms, each in the same document.

Why the diff matters more than the contents

Most vendor governance is invisible. You can read a model card, skim a usage policy, and infer some rails from behavior. You cannot usually read the runtime instructions.

The Opus system prompt is readable because it is injected at inference time and can be extracted with a cooperative prompt. That means external observers, without any privileged access, can do three things enterprise teams almost never do with their own guardrails:

  • Compare versions. 4.6 vs. 4.7 is a real diff. Lines added, lines removed, lines rewritten.
  • Count surfaces. 23 tools is not a number we had to ask for. It is visible.
  • Cite specific language. The child-safety sentence is quotable. That makes it usable in a review document.

This is meta-governance. Not the governance itself, but the auditability of the governance. Enterprise teams building their own agent guardrails rarely produce artifacts with those properties. Most internal system prompts live in a config file no one diffs, owned by whoever last touched it.
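Getting those same three properties for your own prompts takes nothing more exotic than version control and the standard library. A minimal sketch with Python's `difflib` (the file names and prompt text are hypothetical, not Anthropic's):

```python
import difflib

# Two hypothetical revisions of an internal system prompt.
prompt_v1 = """You are a support agent.
Answer at length, with caveats.
Never discuss pricing."""

prompt_v2 = """You are a support agent.
Be concise.
Never discuss pricing.
Once a request is refused for safety reasons, stay cautious for the session."""

# A line-level unified diff: the citable artifact a review doc can quote.
diff = difflib.unified_diff(
    prompt_v1.splitlines(),
    prompt_v2.splitlines(),
    fromfile="prompt_v1.txt",
    tofile="prompt_v2.txt",
    lineterm="",
)
print("\n".join(diff))
```

Each `-`/`+` line in the output is a reviewable decision, which is exactly the property the Willison post demonstrates at the vendor level.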

Use it as reference, not template

The reflex on seeing a well-crafted vendor prompt is to copy it. Resist that. A vendor prompt is tuned for a specific model, a specific tool set, a specific population of users, and a specific risk model. What maps across is the shape of the decisions, not the text.

Three patterns from the Opus diff that translate:

  • Persistent refusal state. Encoding “once refused for reason X, stay cautious” is a design pattern. Your equivalent might be “once a user is flagged for attempted data exfiltration, downgrade tool access for the rest of the session.” Same shape, different content.
  • Tone constraint as prose. “Reduce verbosity” sits next to behavior rules in the same document. If you want shorter answers in your agent, the prompt is a fine place to say so, and a fine place to review when the behavior drifts.
  • Retirement of ad-hoc fixes. The removal of the “genuinely” instruction is a quiet signal that the team treats the prompt as revisable, not sacred. Your prompts should move the same way. Instructions written to patch a training artifact should come out when the artifact is fixed upstream.

What this does not prove

The prompt is one layer. It sits above RLHF, constitutional training, tool-layer hooks, moderation classifiers, and whatever routing happens before the model sees a request. Reading the prompt tells you what the vendor chose to say explicitly at inference time. It does not tell you what the model learned, what the tools refuse, or what the policy engine strips.

So do not overclaim. The diff makes one layer legible. It does not make the stack auditable. Treat it the way you would treat a single, high-quality log line: useful, real, and insufficient on its own.

As we argued in Claude Code Auto Mode Governance, runtime behavior sits on top of configuration, and both need review. And as we laid out in Claude Constitution Enterprise Governance, the constitution Anthropic publishes is training-time governance; the system prompt is inference-time governance. They are different artifacts with different audiences. The Opus 4.7 diff is a reminder that the inference-time layer has its own lifecycle.

The practitioner takeaway

If you run agents in production, do these three things this week:

  1. Diff your own system prompts. If you cannot produce a 4.6-vs-4.7-style diff of your last three prompt revisions, your governance is not yet reviewable.
  2. Count your tools. A tool count is a surface count. It should appear in a review doc, not be discovered by reading a repo.
  3. Write persistent rules in prose. When a rule is stateful (“once X, then Y”), write it that way. A prompt is a document. Treat it like one.
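The second item is a few lines of code once tools live in a declared config rather than scattered registrations. A sketch against a hypothetical JSON schema (the field names are illustrative, not any vendor's format):

```python
import json

# A hypothetical agent config; the schema is illustrative only.
config = json.loads("""
{
  "tools": [
    {"name": "search",     "writes": false},
    {"name": "file_read",  "writes": false},
    {"name": "file_write", "writes": true},
    {"name": "send_email", "writes": true}
  ]
}
""")

tools = config["tools"]
surface_count = len(tools)
write_surfaces = [t["name"] for t in tools if t["writes"]]

# The two numbers a review doc should state explicitly,
# rather than leaving them to be discovered by reading a repo.
print(f"declared tool surfaces: {surface_count}")
print(f"write-capable surfaces: {write_surfaces}")
```

Splitting read-only from write-capable surfaces is a small extra step, but it is usually the distinction a reviewer actually cares about.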

The Opus 4.7 prompt is not a template to copy. It is an existence proof that prompt-layer governance can be legible enough to cite. That is the bar to clear for your own.


This analysis synthesizes Simon Willison’s Changes in the System Prompt Between Claude Opus 4.6 and 4.7 (April 2026).

Victorino Group helps teams design auditable prompt-layer governance for production agents. Let’s talk.

All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.
