Simon Willison says the AI labs found PMF, and it's not chatbots

What happened

On May 27, Simon Willison published a short post titled *I think Anthropic and OpenAI have found product-market fit* — and for a writer who has spent three years carefully refusing to declare anything about LLMs settled, that's a notable shift in tone. His argument is narrow and specific: the two leading frontier labs have stopped behaving like companies searching for a use case and started behaving like companies that found one. That use case is coding — not chat, not 'general assistance,' not enterprise knowledge work, but writing, running, and debugging software in an agent loop.

The evidence Willison points to is mostly product-shape evidence rather than revenue. Anthropic has spent the past year rebuilding around Claude Code, which graduated from an experimental CLI to the centerpiece of the company's developer story. OpenAI has done the same with Codex — not the 2021 model, but the 2025 reincarnation: a cloud-hosted agent that picks up a repo, runs tests, and ships patches. Cursor, Windsurf, Cline, Aider, Zed's agent mode, and a half-dozen forks all converged on roughly the same execution loop within a six-month window. When products from competing labs and competing startups all rhyme this closely, it's usually because the underlying capability finally crossed a threshold and everyone hit the same local maximum at once.

Willison also notes the pricing pattern. Anthropic's Max plan at $100–$200/month and OpenAI's Pro tier at $200/month both exist because there's a population of developers who will pay that — happily — for an agent that closes tickets while they sleep. That's not a 'tools for thought' market; that's a 'replace the junior engineer's overflow queue' market, and it prices accordingly.

Why it matters

The interesting move here isn't that coding turned out to be a good fit for LLMs. Everyone has known that since Copilot shipped in 2021. The interesting move is what the labs *stopped* doing. Anthropic has visibly de-emphasized the consumer chatbot race — Claude.ai is still there, but every product announcement for eighteen months has been about Code, MCP, computer use, or the API. OpenAI's consumer ChatGPT business is enormous and won't go anywhere, but the *innovation budget* — the new SKUs, the new model variants, the agentic scaffolding — is flowing to Codex and the developer platform. Google, the third lab in the room, is the conspicuous holdout still trying to make Gemini work as a general assistant inside Workspace, and it shows in the comparative product velocity.

This matters because PMF for a frontier lab isn't just a revenue signal — it's a *training data* signal. Once a lab decides coding is the product, every subsequent decision (data mix, RL environments, eval suites, post-training recipes) gets pulled toward that target, and the gap to general-assistant competitors widens with every release cycle. This is why Claude 4.5 and GPT-5-Codex feel qualitatively different from their predecessors at code-shaped work but only incrementally better at, say, summarizing a PDF. The labs are optimizing what they measure, and what they measure is increasingly `pass@1 on a real PR`.
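
For readers unfamiliar with the metric, here is a minimal sketch of how a pass@k number is usually computed, using the unbiased estimator popularized by the HumanEval benchmark. Treating each "real PR" as one task, and "passed" as "the test suite went green", is an assumption made for illustration; nothing here describes how either lab actually runs its internal evals.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n: samples drawn for the task, c: samples that passed, k: attempt budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples exist, so any k draws include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_1(results: list[tuple[int, int]]) -> float:
    """Average pass@1 over tasks, each given as (n_samples, n_passed)."""
    return sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, the fraction of attempts that pass on the first try, which is why pass@1 is the natural metric for an agent that is expected to ship a working patch without retries.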

The community reaction on Hacker News (199 points within a few hours, which is high for a Willison post that isn't about a model release) splits along predictable lines. The bullish read: this is the moment LLMs became infrastructure for a specific profession, the way Bloomberg terminals became infrastructure for traders. The bearish read: 'PMF' is doing a lot of work in a sentence where the underlying businesses still burn capital faster than they earn it, and a $200/month plan only looks like PMF if you ignore inference cost per agent-hour. Both can be true. Bloomberg also lost money for years before the lock-in turned into a moat.
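
The bearish arithmetic is easy to make concrete. The sketch below is a back-of-the-envelope model only: the $200 plan price comes from the pricing above, while the cost per agent-hour is a placeholder parameter, since no public figure is cited here.

```python
def break_even_agent_hours(plan_price_usd: float, cost_per_agent_hour_usd: float) -> float:
    """Agent-hours per month at which inference cost equals the plan price."""
    if cost_per_agent_hour_usd <= 0:
        raise ValueError("cost per agent-hour must be positive")
    return plan_price_usd / cost_per_agent_hour_usd


def monthly_margin(plan_price_usd: float, cost_per_agent_hour_usd: float, agent_hours: float) -> float:
    """Plan revenue minus inference cost for one subscriber in one month."""
    return plan_price_usd - cost_per_agent_hour_usd * agent_hours


# Hypothetical input: 4.0 USD per agent-hour is an assumed value, not a reported one.
hours = break_even_agent_hours(200.0, 4.0)
margin = monthly_margin(200.0, 4.0, agent_hours=80.0)
```

Under that assumed cost, a subscriber who runs agents for more than `hours` per month is served at a loss, which is the scenario the bearish read worries about. Plug in a different cost and the conclusion can flip.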

The more honest framing, which Willison gestures at without quite saying: the labs have found product-market fit with developers specifically, and they're betting the rest of the economy will follow developers the way the rest of the economy followed Slack and GitHub. That's a real bet, not a sure thing. Developers are unusually tolerant of broken tooling, unusually willing to glue things together, and unusually bad as a proxy for what knowledge workers in general will adopt. The fit is real; the extrapolation is the speculation.

What this means for your stack

If you're a senior engineer reading this and you haven't restructured how you use these tools in the last six months, you're probably leaving a lot on the table. The 2024 workflow — open Cursor, accept tab-completions, occasionally chat with the sidebar — is now the floor, not the ceiling. The 2026 workflow looks more like: write a one-paragraph spec, hand it to Claude Code or Codex in agent mode, let it run for ten to thirty minutes against a sandboxed checkout, review the PR, and iterate. The unit of work shifted from *line* to *task*, and the cognitive load shifted from *typing* to *specifying and reviewing*.

A few concrete adjustments are worth making. One: invest in your repo's machine-readability, meaning a good README, a good CLAUDE.md / AGENTS.md, a fast test suite, and a deterministic local dev setup. The labs have effectively standardized on 'agent reads the repo, runs the tests, iterates', and repos that don't support that loop are now actively harder to work in than repos that do. Two: don't pay for an IDE-only AI plan unless you also pay for an agentic one. The IDE is the wrong unit now that the work is measured in tasks rather than lines. Three: build the habit of reading agent-generated diffs the way you'd read a contractor's PR: skeptically, with a focus on the boundary conditions and the tests, not the happy path.
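
As a starting point for the first adjustment, a rough readiness check can be scripted. This is a sketch under stated assumptions: the file names CLAUDE.md and AGENTS.md come from the text above, but the `tests` / `test` directory convention and the function names are illustrative choices, and file presence says nothing about whether the test suite is fast or the dev setup deterministic.

```python
from pathlib import Path

# Instruction files named in the article; either one counts.
AGENT_DOCS = ("CLAUDE.md", "AGENTS.md")
# Assumed convention for where tests live; adjust for your repo.
TEST_DIRS = ("tests", "test")


def agent_readiness(repo: Path) -> dict[str, bool]:
    """Report which agent-facing basics exist at the repo root."""
    names = {p.name for p in repo.iterdir() if p.is_file()}
    return {
        "readme": any(n.lower().startswith("readme") for n in names),
        "agent_instructions": any(n in names for n in AGENT_DOCS),
        "tests_dir": any((repo / d).is_dir() for d in TEST_DIRS),
    }


def missing_pieces(repo: Path) -> list[str]:
    """Names of the checks that failed, in a stable order."""
    return [name for name, ok in agent_readiness(repo).items() if not ok]
```

Calling `missing_pieces(Path("."))` from a repo root returns the list of gaps. An empty list means only that the files exist, so the skeptical review in the third adjustment still applies.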

The harder organizational question is what to do about the junior-engineer pipeline, because if Willison is right about pricing, the labs are explicitly targeting the overflow queue that used to be how junior engineers learned. There's no good answer yet, but pretending it isn't happening is the worst answer.

Looking ahead

The interesting thing to watch over the next two quarters isn't whether Anthropic and OpenAI keep winning at coding — they will, until someone open-sources a model that closes the gap. The interesting thing is whether either lab can translate developer PMF into a *second* vertical. Anthropic is clearly probing legal and finance with computer-use demos; OpenAI keeps poking at the enterprise knowledge-worker seat. If one of them lands a second PMF in the next 18 months, the 2027 conversation is about platform companies. If neither does, the 2027 conversation is about two very profitable, very narrow developer-tools businesses with $50B+ valuations and a lot of explaining to do.
