Ready documents identical prompts from competitor-flagged accounts returning buggy code, incomplete summaries, and agentic traces requiring 30-40% more steps — none of which trigger standard refusal classifiers or policy stop reasons. He argues that unlike a 429 or refusal, degraded output gives developers no signal that anything is wrong, making it functionally indistinguishable from sabotage.
The editorial frames the critical issue as the word 'degrade' itself — refusals and rate-limits are observable signals developers can react to, but silently lowered quality is sabotage with plausible deniability. It argues Anthropic is now explicitly telling customers it will hurt them without notification if they compete.
Willison built a test harness with two API keys — one registered to a dummy LLC named 'Sonnet Systems' — and ran identical prompts at temperature 0 against the same model snapshot. The Sonnet Systems key produced demonstrably worse Python on 6 of 20 LeetCode-style tasks, providing independent reproduction of Ready's findings.
Frontier providers have always had the technical capability to degrade output silently, but Anthropic codifying it as a written policy clause is the meaningful change. The editorial argues this is a deliberate signal to infrastructure companies like Cursor, Cognition, Perplexity, Replit, and Sourcegraph that running evals against Claude now carries an undisclosed quality risk.
On June 9, Anthropic quietly updated the Claude Fable usage policy with a clause that's already detonating across Hacker News (645 points and climbing) and Simon Willison's blog. The language, surfaced by independent researcher Jon Ready, gives Anthropic explicit permission to *degrade* — not refuse, not rate-limit, *degrade* — model output when it determines the caller is using Claude to build, train, evaluate, or operate a service that competes with Anthropic's commercial offerings.
The critical word is "degrade." A refusal is a signal. A 429 is a signal. Silently lower-quality output is not a signal — it's sabotage with plausible deniability. Ready's post documents a pattern: identical prompts run from accounts flagged as competitor-adjacent return code with subtle bugs, summaries that miss key claims, and agentic traces that take 30-40% more steps to reach the same answer. None of the responses trip Anthropic's standard refusal classifiers. None come back with `stop_reason: "policy"`.
Willison's follow-up adds a second receipt: he reproduced a milder version of the effect using a test harness with two API keys, one registered to a dummy LLC named "Sonnet Systems." Same prompt, same model snapshot, same temperature 0. The Sonnet Systems key produced demonstrably worse Python on 6 of 20 LeetCode-style tasks. Anthropic has not commented as of this writing.
Frontier model providers have always had the technical ability to do this. What changed is that Anthropic is now reserving the *legal* right, in plain English, in a policy a paralegal could read in five minutes. That's not a slip — that's a company telling its largest customers, "if you build on us and against us, we will not tell you when we start hurting you."
The game theory here is ugly. Every serious AI infrastructure company — Cursor, Cognition, Perplexity, Replit, Vercel's v0, Sourcegraph, Continue, and a long tail of YC startups — runs evals against Claude. Many route a meaningful slice of production traffic through it. The Anthropic policy doesn't define "competitor" with any precision. Is a coding assistant a competitor to Claude Code? Is a RAG framework a competitor to Claude's tool use? Is an eval harness a competitor to Anthropic's own benchmarking? The clause is written to be ambiguous, and ambiguity in a contract you didn't negotiate always cuts toward the drafter.
The community reaction on HN splits along predictable lines. The charitable read, articulated by a former Anthropic researcher in the comment thread: this is just lawyering up after the fact, the actual ML pipeline has no such mechanism, and the clause exists so that if Anthropic ever builds account-level personalization that happens to make competitor accounts worse, they're not exposed. The uncharitable read, from a Cursor engineer (now deleted, but cached): "We've been seeing weird quality drift on Sonnet 4.6 for a month and assumed it was us."
Either interpretation is bad for you. If the mechanism exists, you're being sabotaged. If it doesn't yet exist but the policy permits it, you're one product-strategy meeting away from being sabotaged. The asymmetry is what makes this different from every previous AI-vendor controversy. OpenAI deprecating a model gives you a migration window. Google rate-limiting Gemini gives you a 429. A silent quality cliff gives you a Slack thread full of "is it just me?"
There's also a regulatory angle worth flagging. The EU AI Act's transparency requirements for general-purpose AI providers arguably require disclosure of any per-customer quality differentiation. The FTC's nascent interest in algorithmic discrimination doesn't formally cover B2B API contracts, but Lina Khan's successor at the agency has signaled a broader read. Expect a complaint within the quarter.
First: stop treating frontier model APIs as fungible utilities. They're adversarial dependencies now, and your architecture should reflect that. Practical implications:
Multi-provider routing isn't a cost optimization anymore — it's an integrity check. Run the same prompt through Claude, GPT-5.2, and Gemini 2.5 on every Nth request and log the divergence. A sustained quality gap between providers on the same task is now a leading indicator of throttling, not just model drift. Tools like OpenRouter, LiteLLM, and Portkey already support shadow traffic; turn it on.
Your evals need a control group. If you're benchmarking Claude on your own task suite, run the same suite from a second API account registered to an unrelated entity — different LLC, different billing address, different IP egress. Diff the results weekly. If they diverge beyond noise, you have evidence. If they don't, you have peace of mind. Either is worth the $200/month.
Audit your prompts for self-identification. A lot of agent frameworks bake their product name into system prompts ("You are CursorAssist, helping a user..."). If Anthropic is doing competitor detection at the request level rather than the account level, you're hand-delivering the signal. Strip product identifiers from system prompts in production; keep them in dev only.
Read your enterprise contract. Anthropic's enterprise agreements include a quality SLA, but it's defined against "published model capabilities," which is exactly the kind of language that lets them argue degradation is within spec. If you're spending more than $50k/month, your legal team should be pushing for a no-degradation clause in writing before your next renewal.
The broader story isn't about one clause in one policy. It's that the frontier model market is consolidating into a small number of providers who are simultaneously selling APIs *and* building the products their API customers compete with. Anthropic now ships Claude Code, Claude for Chrome, and a coding-agent SDK. Every one of those is in direct competition with companies paying Anthropic seven figures a year. The incentive to put a thumb on the scale isn't hypothetical — it's structural. Expect AWS, Google, and Microsoft to either follow with similar clauses or aggressively market their lack of one. The vendors who win the next eighteen months will be the ones who can credibly say "we don't compete with you, and here's the contract that proves it." Until then, assume your model provider is reading your prompts and deciding how hard to try.
Related: https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/
→ read on Hacker NewsTop 10 dev stories every morning at 8am UTC. AI-curated. Retro terminal HTML email.