The scaling curve bent: what 'AI is slowing down' means for your stack

What happened

Ed Zitron's latest at Where's Your Ed At, *AI Is Slowing Down*, landed at 497 points on Hacker News with a thesis the room has been circling for months but mostly refused to say out loud: the frontier-model improvement curve has visibly bent, while the capex and inference cost curves have not.

Zitron's specific claims are worth pinning down before the discourse swallows them. He argues that the gap between GPT-5 and GPT-4o on standard reasoning and coding suites (MMLU-Pro, GPQA, SWE-bench Verified) is a fraction of the GPT-3.5 → GPT-4 jump, despite roughly an order of magnitude more training compute. He notes that Anthropic's Claude Sonnet 4.5 and Google's Gemini 2.5 Pro are now clustered within 1–2 points of each other on most public benchmarks, close enough that the ordering depends on which subset you cherry-pick. And he hammers on the gap between *demoed* capability and *deployed* economics: OpenAI's projected 2026 cash burn, xAI's gas-turbine-powered Memphis buildout, and Anthropic's reported gross margins all imply that the unit economics of frontier inference still don't close without enterprise contract leverage.

The HN comments, 1,200+ deep by the time the front page moved on, split predictably. Skeptics of the slowdown thesis pointed to RL-on-reasoning gains (o-series, Claude's extended thinking) as proof the S-curve has another inflection. Practitioners mostly agreed with Zitron's framing but disagreed on the cause: data exhaustion vs. architectural ceilings vs. simply 'we hit the useful part of the distribution and the long tail is hard.'

Why it matters

The interesting move here isn't the prediction. Predicting an AI slowdown in 2026 is a coin flip — the curves are noisy and the labs have unreleased capability. The interesting move is treating 'plateau' as a structural condition to engineer around, rather than a temporary lull to wait out.

For three years the dominant pattern was *upgrade and ship*. You'd build against GPT-4, the next model would drop, your eval scores would jump 8–15 points without code changes, and your roadmap would assume that cadence forever. That pattern is breaking. Claude 4.5 to 4.6 was a 2–3 point bump on most internal evals teams have shared. GPT-5 underwhelmed against the leaks. Gemini 2.5 closed the gap but didn't open a new one. The 'free upgrade' tax break is over.

What replaces it is uglier and more interesting: scaffolding eats the delta. The teams shipping the best AI products in 2026 aren't the ones with privileged access to a smarter base model — everyone has access to roughly the same intelligence ceiling. They're the ones with better retrieval, better tool-use loops, better verification layers, and better human-in-the-loop UX. Cursor, Cognition, and the better Claude Code competitors have demonstrated this repeatedly: the same Sonnet checkpoint produces wildly different product quality depending on harness design.

This matches what systems programmers learned in the mid-2000s. Once single-threaded performance growth slowed, the action moved up the stack: cache-aware data layouts, branch-friendly code, SIMD, then concurrency. The chip stopped getting faster; the people who *used* the chip got smarter. We're at the equivalent inflection for LLM-powered products: the model is the silicon, and the engineering is everything you wrap around it.

The second-order effect Zitron underplays is that this is *bad news for the labs and good news for application developers.* If Sonnet-class intelligence is now commoditized across three vendors, switching cost collapses. Pricing power moves to whoever owns the workflow, not whoever owns the weights. That's why every frontier lab is suddenly shipping IDEs, agents, and 'Code' products — they can read the same balance sheet.

What this means for your stack

Three concrete adjustments if you ship LLM-backed product code:

Stop pricing in capability gains that may not come. If your roadmap has a Q3 feature that assumes 'the next model will handle this', kill it or rewrite it against today's capabilities. The teams hurt worst by the GPT-5 disappointment were the ones whose 2026 plans depended on it being a step change. Plan for flat capability and treat any improvement as upside.

Invest in evals like they're load tests. When models were improving fast, evals were a nice-to-have because the next checkpoint would fix your regression. With a flat curve, eval infrastructure becomes the primary lever for product quality. Teams that have been treating evals as a chore — running a 50-case suite by hand quarterly — need the equivalent of a CI pipeline: thousands of cases, branched by use case, run on every prompt change, with regression alerts. This is unglamorous and it's where the next year's product wins come from.
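To make the CI analogy concrete, here is a minimal sketch of a regression gate, not a reference implementation. It assumes exact-match grading and a hypothetical `model_fn` callable that wraps whatever provider you use; `EvalCase`, `pass_rates`, `regressions`, and the 2-point tolerance are all illustrative names and values, not anything from Zitron's piece.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalCase:
    use_case: str  # bucket used to branch the suite, e.g. "classification"
    prompt: str
    expected: str  # exact-match grading keeps the sketch simple (an assumption)


def pass_rates(cases: List[EvalCase], model_fn: Callable[[str], str]) -> Dict[str, float]:
    """Run every case through model_fn and return the pass rate per use case."""
    passed: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for case in cases:
        total[case.use_case] += 1
        if model_fn(case.prompt).strip() == case.expected:
            passed[case.use_case] += 1
    return {use_case: passed[use_case] / total[use_case] for use_case in total}


def regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.02,  # hypothetical threshold: alert on drops above 2 points
) -> Dict[str, float]:
    """Return the pass-rate delta for each use case that dropped more than tolerance."""
    return {
        use_case: current.get(use_case, 0.0) - base
        for use_case, base in baseline.items()
        if base - current.get(use_case, 0.0) > tolerance
    }
```

In a pipeline you would store the baseline rates from the last accepted prompt version, recompute `pass_rates` on every prompt change, and fail the build when `regressions` returns a non-empty dict. Real suites usually need fuzzier grading than exact match, which is where most of the engineering effort goes.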

Multi-model architecture becomes table stakes. If the labs are within noise of each other, route by cost and latency, not capability. The cheapest correct answer wins. Use Haiku/Flash/GPT-5-mini for high-volume retrieval and classification, reserve Sonnet/Opus-class for the actual reasoning step, and design fallback chains so a single provider outage doesn't take you down. The `claude-cli.js → codex-cli` fallback chain in this codebase isn't aspirational anymore — it's the default architecture.
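A cost-ordered fallback chain can be sketched in a few lines. This is an illustration under stated assumptions, not the `claude-cli.js → codex-cli` chain itself: `Route`, `complete`, `AllRoutesFailed`, and the `accept` verifier are hypothetical names, and each `call` stands in for a provider client wrapper that raises on failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Route:
    name: str                   # hypothetical label, e.g. "small" or "large"
    cost_per_call: float        # assumed relative cost, used only for ordering
    call: Callable[[str], str]  # provider wrapper; expected to raise on failure


class AllRoutesFailed(RuntimeError):
    """Raised when every route errored or produced a rejected answer."""


def complete(
    prompt: str,
    routes: List[Route],
    accept: Optional[Callable[[str], bool]] = None,
) -> Tuple[str, str]:
    """Try routes cheapest-first; fall through on errors or rejected answers.

    Returns (route_name, answer) from the first route whose answer passes
    the optional accept() check.
    """
    errors: List[str] = []
    for route in sorted(routes, key=lambda r: r.cost_per_call):
        try:
            answer = route.call(prompt)
        except Exception as exc:  # broad on purpose: any provider failure falls through
            errors.append(f"{route.name}: {exc}")
            continue
        if accept is None or accept(answer):
            return route.name, answer
        errors.append(f"{route.name}: answer rejected by verifier")
    raise AllRoutesFailed("; ".join(errors))
```

The `accept` hook is what turns "cheapest answer" into "cheapest correct answer": a schema check, a unit test, or a classifier decides whether the cheap model's output is good enough before the request escalates. A production version would likely add timeouts, retries, and per-route logging.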

Looking ahead

Zitron is right that the discourse has gotten ahead of the capability, and the bill for that gap is going to come due in 2026 — for the labs first, then for the funds, then for the application teams whose pitch decks promised AGI-adjacent features. But the engineer's takeaway isn't pessimism. It's that the interesting work just shifted from 'wait for the next model' to 'extract everything from this one.' That's the work that compounds, that's the work that builds product moats, and that's the work the labs can't ship on your behalf. The slowdown, if it's real, is the best thing that's happened to application developers since the API opened.
