Willison's detailed diff between Opus 4.6 and 4.7 system prompts demonstrates that Anthropic modifies refusal behavior, code generation patterns, and capability framing through prompt changes alone — independently of model weight updates. He treats system prompts as version-controlled artifacts deserving the same scrutiny as changelogs, and has built a practice of extracting and comparing them across Claude, GPT, and Gemini releases.
Submitted Willison's post to Hacker News where it reached 282 points and 166 comments, signaling strong community agreement that these hidden prompt diffs are worth surfacing and discussing publicly.
The editorial argues that developers who pin to a specific model version like claude-opus-4-7-20260415 are only locking in weights — the default system prompt wrapping API calls can still shift behavior and break downstream assumptions. If a Claude-powered feature starts behaving differently after a version bump, checking only the model changelog misses half the picture because the system prompt is doing real behavioral work on edge cases and refusal aggressiveness.
Willison's diff shows concrete modifications to how Claude handles refusals, tool use, and capability self-description — demonstrating that prompt changes produce a measurably different assistant even when model weights are identical. His comparison reveals that some changes tighten guardrails while others loosen them, meaning the net behavioral effect is a deliberate tuning knob Anthropic adjusts between releases.
The editorial explicitly frames the system prompt as 'an instruction set that determines how the model handles edge cases, how aggressively it refuses requests' — reinforcing that these are engineering decisions with concrete downstream effects, not cosmetic text changes.
Simon Willison — the developer behind Datasette and one of the most meticulous documentarians of AI model behavior — published a detailed comparison of Claude's default system prompt between Opus 4.6 and Opus 4.7 on April 18, 2026. The post, which quickly climbed to 282 points on Hacker News, lays out the specific additions, removals, and rewording Anthropic applied to the instructions that sit between the model weights and the user's actual conversation.
This is not the first time Willison has done this. He's built a practice of extracting and diffing system prompts across Claude, GPT, and Gemini releases — treating them as version-controlled artifacts that deserve the same scrutiny as changelogs. The core finding: Anthropic uses system prompt changes as a behavioral steering mechanism that operates independently of model training, and these changes often go undocumented in release notes.
The diff itself — available in full on Willison's blog — shows modifications to how Claude is instructed to handle refusals, code generation patterns, tool use behavior, and the framing of its own capabilities. Some changes tighten guardrails; others loosen them. The net effect is a measurably different assistant, even when the underlying model weights are identical.
For developers building production applications on Claude's API, system prompt changes are a hidden variable. When you pin your application to `claude-opus-4-7-20260415`, you're locking in the model weights — but the default system prompt that wraps your API calls can shift behavior in ways that break downstream assumptions. If your Claude-powered feature started behaving differently after a version bump and you only checked the model changelog, you missed half the picture.
This matters because the system prompt is doing real work. It's not a polite preamble — it's an instruction set that determines how the model handles edge cases, how aggressively it refuses requests, how it structures code output, and how it represents uncertainty. A single sentence change in the system prompt can flip the model from "here's how to do that" to "I can't help with that" for an entire category of requests.
The Hacker News discussion surfaced a legitimate tension in the AI developer ecosystem. On one side: Anthropic needs the flexibility to improve safety and behavior without cutting a new model version for every tweak. On the other: developers who are paying for API access expect behavioral stability, or at minimum, a changelog. The system prompt sits in an uncomfortable middle ground — it's not part of the model, but it's not part of the developer's code either.
Willison's approach of treating these prompts as diffable artifacts is arguably more useful than Anthropic's own release notes. A raw diff tells you exactly what changed. A blog post saying "improved safety and helpfulness" tells you nothing actionable.
Most developers building on LLM APIs treat the model as a function: input goes in, output comes out. The system prompt — whether default or custom — is the function signature, and when it changes without notice, you've got a runtime contract violation.
There are two practical responses. First, if you're using Claude via the API, always supply your own system prompt. Don't rely on the default. This insulates you from upstream changes and gives you version control over the behavioral contract. Anthropic's default system prompt is designed for the general-purpose chatbot experience; if you're building a code review tool or a medical triage assistant, you should have been overriding it anyway.
Second, build behavioral regression tests. Not unit tests on outputs (those are too brittle with LLMs), but behavioral assertions: "given this input category, the model should attempt an answer rather than refuse," or "code outputs should include error handling." When you upgrade model versions, run your behavioral suite before deploying — and if something breaks, check the system prompt diff before blaming the weights.
This is the same discipline that backend engineers apply to database migrations or API versioning. The only difference is that the LLM ecosystem hasn't built the tooling yet. Willison's blog posts are, in effect, manual migration notes for the industry.
Anthropic publishes more about its safety research and model design than most competitors. But system prompt changes occupy a documentation gap. They're not covered by the model card, they're not in the API changelog, and they're not visible through the API itself. You have to extract them — typically by asking the model to repeat its instructions, a technique that works inconsistently and that providers actively try to prevent.
This creates an odd dynamic where the most important behavioral documentation for Claude comes from a third-party blogger in the UK rather than from Anthropic's own developer relations team. It's a credit to Willison's thoroughness, but it's also a structural problem. Developers shouldn't need to rely on prompt archaeology to understand why their AI-powered features changed behavior overnight.
The counterargument — and it's not unreasonable — is that system prompts contain safety-relevant instructions that Anthropic doesn't want to make trivially bypassable. Publishing exact diffs could enable jailbreaking. But this is a false binary: Anthropic could publish behavioral change summaries ("refusal thresholds for X category tightened") without revealing exact prompt text. The fact that they don't suggests this is a prioritization gap, not a deliberate policy.
If you're building on any LLM API — not just Claude — treat the system prompt as a first-class dependency. Pin it. Version it. Test against it. When you upgrade models, diff the default system prompts (Willison maintains a collection) and map changes to your application's behavioral expectations.
For teams running Claude in production, the specific 4.6 → 4.7 changes Willison documented should be reviewed against your use cases. If you're relying on default behavior for code generation, tool use, or nuanced refusal handling, test those paths explicitly on 4.7 before upgrading.
More broadly, this is a maturity signal for the LLM tooling ecosystem. We need system-prompt-aware CI pipelines the way we need schema-aware database migrations. The teams that build this discipline now will spend less time debugging mysterious behavior regressions in six months.
Willison's system prompt diffs have become a de facto industry standard for understanding model behavior changes. The fact that a solo blogger provides better behavioral documentation than billion-dollar AI labs is both impressive and concerning. As LLMs move deeper into production infrastructure, the gap between "model release" and "behavioral documentation" will become a real operational risk. The question isn't whether AI providers will start publishing system prompt changelogs — it's whether they'll do it before a high-profile production incident forces their hand.
> The new <acting_vs_clarifying> section includes: When a request leaves minor details unspecified, the person typically wants Claude to make a reasonable attempt now, not to be interviewed first.Uff, I've tried stuff like these in my prompts, and the results are never good, I much pre
> Claude keeps its responses focused and concise so as to avoid potentially overwhelming the user with overly-long responses. Even if an answer has disclaimers or caveats, Claude discloses them briefly and keeps the majority of its response focused on its main answer.I am strongly opinionated aga
The eating disorder section is kind of crazy. Are we going to incrementally add sections for every 'bad' human behaviour as time goes on?
I'm fascinated that Anthropic employees, who are supposed to be the LLM experts, are using tricks like these which go against how LLMs seem to work.Key example for me was the "malware" tool call section that included a snippet with intent "if it's malware, refuse to edit the
Top 10 dev stories every morning at 8am UTC. AI-curated. Retro terminal HTML email.
The past month made me realize I needed to make my codebase usable by other agents. I was mainly using Claude Code. I audited the codebase and identified the points where I was coupling to it and made a refactor so that I can use either codex, gemini or claude.Here are a few changes:1. AGENTS.md by