AI Writes Code Faster. You Review It Slower. The Net Is Negative.

├── "AI coding tools should be used to slow down and improve quality, not speed up output"
│  └── Nolan Lawson (nolanlawson.com)

Lawson argues that AI assistants like Claude and Cursor haven't made him faster — and that's intentional. He reinvests the typing time saved into writing tests first, reading diffs line-by-line, considering edge cases, and refactoring for clarity, producing better code at lower throughput.

├── "The industry's 'AI makes developers faster' narrative measures generation while ignoring verification"
│  └── Nolan Lawson (nolanlawson.com)

Lawson explicitly contrasts his approach with marketing claims like GitHub's 55% faster task completion and Cursor's 'speed of thought' pitch. He cites the July 2025 METR study showing experienced developers were actually 19% slower with Cursor + Claude while believing they were 24% faster — evidence that vendor benchmarks measure code generation, not the verification work where senior engineers spend most of their time.

└── "LLM-generated code collapses the trust prior, making review fundamentally harder than reviewing human code"
  └── top10.dev editorial (top10.dev) → read below

The editorial frames a category error in productivity discourse: when a junior engineer writes 100 lines, reviewers implicitly trust ~80% based on shared mental models and code-review priors. When an LLM writes 100 lines, that trust prior collapses to zero, forcing reviewers to verify every line — which is why generation-speed metrics misrepresent real engineering throughput.

What happened

Nolan Lawson, a longtime Mozilla and Salesforce engineer, author of the PouchDB library, and one of the more thoughtful voices in the frontend performance world, published an essay titled *Using AI to write better code more slowly*. The post hit #1 on Hacker News with 904 points and several hundred comments, most of them from working engineers wrestling with the same tension.

Lawson's thesis is unfashionable: AI coding assistants haven't made him faster. They've made him slower — and that's the point. He describes a workflow where Claude and Cursor generate candidate solutions, and he then spends the saved typing time on things he used to skip: writing the test first, reading the diff line-by-line, considering edge cases, refactoring for clarity. The output is better code. The throughput is lower. He considers this a feature.

He explicitly contrasts his approach with the dominant marketing pitch — "ship 10x faster with AI" — and aligns himself with the METR study from July 2025, which found that experienced open-source developers using Cursor + Claude completed real-world tasks 19% slower than without AI tools, despite *believing* they were 24% faster. The gap between perceived and actual productivity is the story.
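The size of that gap is easier to see with numbers. The sketch below is illustrative arithmetic only, not the study's methodology. It assumes a hypothetical 60-minute baseline task, reads "19% slower" as 19% more time, and reads "24% faster" as 24% less time. The function names are invented for this example.

```python
def actual_minutes(baseline: float, slowdown: float = 0.19) -> float:
    """Time with AI tools, assuming the task takes `slowdown` more time."""
    return baseline * (1 + slowdown)


def perceived_minutes(baseline: float, believed_speedup: float = 0.24) -> float:
    """Time the developer believes the task took (assumed: 24% less time)."""
    return baseline * (1 - believed_speedup)


def perception_gap(baseline: float) -> float:
    """Minutes between what the task actually took and what it felt like."""
    return actual_minutes(baseline) - perceived_minutes(baseline)


if __name__ == "__main__":
    baseline = 60.0  # hypothetical unaided task length, in minutes
    print(f"actual:    {actual_minutes(baseline):.1f} min")
    print(f"perceived: {perceived_minutes(baseline):.1f} min")
    print(f"gap:       {perception_gap(baseline):.1f} min")
```

Under those assumptions, a task that would take an hour unaided takes about 71 minutes with AI but feels like about 46, a gap of roughly 26 minutes per hour of baseline work.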

Why it matters

The industry has spent two years pricing AI coding tools as if they're a pure multiplier on developer output. GitHub claims 55% faster task completion. Cursor's pitch deck talks about "the speed of thought." Anthropic's own benchmarks emphasize lines-per-hour. But every one of those numbers measures generation, not verification — and verification is where senior engineers actually spend their day.

Lawson's framing exposes a category error baked into the productivity discourse. When a junior engineer writes 100 lines of code, the senior reviewer trusts roughly 80% of it implicitly, based on a shared mental model and code-review priors. When an LLM writes 100 lines, the trust prior collapses to zero: every variable name, every imported function, every conditional branch needs to be re-derived from first principles, because the model has no shared context with you, no accountability, and a documented tendency to hallucinate APIs that don't exist. The HN comment thread is full of engineers describing the same arc: initial euphoria, followed by a debugging session where they discover the AI confidently invented a method signature.

The deeper point, which Lawson makes obliquely and the comments make explicitly: AI doesn't eliminate the work, it changes where the work lives. You save 30 minutes on boilerplate and spend 45 minutes verifying that the boilerplate doesn't have a subtle off-by-one in the pagination logic, a net loss of 15 minutes. If you skip the verification, which is what most "10x productivity" demos quietly do, you're not faster, you're just shipping more bugs faster.
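To make the pagination example concrete, here is a minimal sketch of the kind of off-by-one that reads as plausible in a diff and only shows up at the boundaries. Both functions and the list of cases are hypothetical, written for this illustration. They are not taken from Lawson's post.

```python
def page_count_draft(total_items: int, page_size: int) -> int:
    # Plausible-looking draft: over-counts by one page whenever
    # total_items is an exact multiple of page_size (including zero).
    return total_items // page_size + 1


def page_count(total_items: int, page_size: int) -> int:
    """Number of pages needed, using ceiling division."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if total_items < 0:
        raise ValueError("total_items must be non-negative")
    return (total_items + page_size - 1) // page_size


# (total_items, page_size, expected_pages), chosen to sit on the boundaries.
BOUNDARY_CASES = [
    (0, 10, 0),
    (1, 10, 1),
    (9, 10, 1),
    (10, 10, 1),
    (11, 10, 2),
    (20, 10, 2),
]


def failing_cases(fn):
    """Return (total, size, expected, got) for every boundary case fn gets wrong."""
    failures = []
    for total, size, expected in BOUNDARY_CASES:
        got = fn(total, size)
        if got != expected:
            failures.append((total, size, expected, got))
    return failures


if __name__ == "__main__":
    print("draft failures:", failing_cases(page_count_draft))
    print("fixed failures:", failing_cases(page_count))
```

The draft passes for 1, 9, and 11 items, which is why a quick skim or a single happy-path test would accept it. Only the exact-multiple cases expose the bug.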

This lines up with what Stack Overflow's 2025 developer survey showed: 76% of devs use AI tools, but only 33% trust their output, down from 43% the year before. Trust is going the wrong direction even as adoption climbs. The people closest to the code are the most skeptical of the code.

There's also a generational fault line embedded in this debate. Engineers with 10+ years of experience tend to use AI the way Lawson does — as a faster draft generator that they then aggressively rewrite. Engineers in their first three years are more likely to ship the first draft, because they lack the priors to know what "wrong" looks like. This is the actual concerning trend, not the abstract "AI will replace devs" panic. The danger isn't that AI writes bad code; it's that AI writes plausible-looking bad code, and plausibility is enough to defeat a reviewer who's also using AI to summarize the diff.

What this means for your stack

If you're a tech lead, the operational implication is to stop measuring AI ROI in lines-per-hour or PRs-per-week. Those metrics will go up. They are also meaningless. Measure instead: post-merge defect rate, time-to-resolve for incidents traceable to AI-generated code, and the ratio of review comments on AI-authored PRs to review comments on human-authored PRs. Anecdotally, teams that have started tracking the last metric report 2-3x more review comments on AI PRs, which is either a sign of healthy skepticism or wasted reviewer cycles, depending on whether the comments find real bugs.
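As a rough sketch of how two of those metrics could be computed, assume each merged PR is exported as a record with an AI-authorship flag, a review comment count, and a count of defects later traced back to it. The `PullRequest` schema, the field names, and the sample numbers are all hypothetical. Real data would come from your own tracker, and how a PR gets labeled as AI-authored is left open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PullRequest:
    ai_authored: bool        # hypothetical label, e.g. from a PR template checkbox
    review_comments: int     # reviewer comments left on the PR
    post_merge_defects: int  # bugs later traced back to this PR


def defect_rate(prs: List[PullRequest]) -> Optional[float]:
    """Average post-merge defects per PR, or None for an empty group."""
    if not prs:
        return None
    return sum(pr.post_merge_defects for pr in prs) / len(prs)


def review_comment_ratio(prs: List[PullRequest]) -> Optional[float]:
    """Average comments per AI PR divided by average comments per human PR."""
    ai = [pr for pr in prs if pr.ai_authored]
    human = [pr for pr in prs if not pr.ai_authored]
    if not ai or not human:
        return None
    human_avg = sum(pr.review_comments for pr in human) / len(human)
    if human_avg == 0:
        return None  # ratio is undefined when human PRs drew no comments
    ai_avg = sum(pr.review_comments for pr in ai) / len(ai)
    return ai_avg / human_avg


# Made-up sample data, for illustration only.
SAMPLE = [
    PullRequest(ai_authored=True, review_comments=6, post_merge_defects=1),
    PullRequest(ai_authored=True, review_comments=4, post_merge_defects=1),
    PullRequest(ai_authored=False, review_comments=2, post_merge_defects=0),
    PullRequest(ai_authored=False, review_comments=2, post_merge_defects=1),
]

if __name__ == "__main__":
    ai_prs = [pr for pr in SAMPLE if pr.ai_authored]
    human_prs = [pr for pr in SAMPLE if not pr.ai_authored]
    print("AI defect rate:", defect_rate(ai_prs))
    print("human defect rate:", defect_rate(human_prs))
    print("comment ratio:", review_comment_ratio(SAMPLE))
```

Reading the two numbers together is what separates the cases described above: a high comment ratio alongside a falling defect rate suggests the extra review is catching real bugs, while a high ratio with no change in defects suggests wasted cycles.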

For individual contributors, Lawson's prescription is concrete: treat every AI output as if a stranger on the internet just opened a PR. Demand tests. Run the code in isolation before integrating. Read the diff before accepting it. If your IDE's autocomplete UX makes any of these steps harder than tab-to-accept, your tool is optimized for the wrong metric. Cursor's recent agent-mode push, where the AI commits and pushes without human review, is the apotheosis of this misalignment.

The organizational implication is harder. If verification time is the real bottleneck, then the highest-leverage AI investment isn't a better code generator — it's a better diff reviewer, a better test generator, a better static analyzer that catches the hallucinated API call before it ships. Tools like Greptile, CodeRabbit, and the new wave of "AI reviews AI" startups are arguably more important than the next generation of Copilot. The market hasn't priced this in yet.
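A toy version of the "catch the hallucinated API call" idea can be sketched with Python's standard `ast` and `importlib` modules. This is an assumption-laden illustration, not a description of how Greptile, CodeRabbit, or any other product works. It only handles plain `import x` and `import x as y` statements, only checks calls of the form `x.name(...)`, ignores local shadowing, and imports the modules it checks, so it should only be pointed at modules you already trust. The function name `find_missing_attributes` is invented for this example.

```python
import ast
import importlib
from typing import Dict, List


def find_missing_attributes(source: str) -> List[str]:
    """Report calls like `mod.func(...)` where `mod` has no attribute `func`."""
    tree = ast.parse(source)

    # Map each local name bound by an import to the module it refers to.
    aliases: Dict[str, str] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    aliases[alias.asname] = alias.name
                else:
                    # `import a.b` binds only the top-level name `a`.
                    top = alias.name.split(".")[0]
                    aliases[top] = top

    problems: List[str] = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name)):
            continue
        module_name = aliases.get(func.value.id)
        if module_name is None:
            continue  # not a call on an imported module
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not installed here, so nothing can be verified
        if not hasattr(module, func.attr):
            problems.append(f"line {node.lineno}: {func.value.id}.{func.attr}")
    return problems


# A generated draft that mixes one real call with two invented ones.
DRAFT = (
    "import json\n"
    "import math as m\n"
    "\n"
    "data = json.loads('{}')\n"
    "obj = json.parse('{}')\n"
    "root = m.sqroot(4)\n"
)

if __name__ == "__main__":
    for problem in find_missing_attributes(DRAFT):
        print(problem)
```

A check this shallow misses most real problems (wrong argument order, methods on objects, dynamically defined attributes), but it shows why verification tooling is a tractable target: the invented call is mechanically detectable before a human ever reads the diff.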

Looking ahead

The interesting question for 2026 isn't whether AI will write more code — it obviously will — but whether the industry will admit that writing was never the bottleneck. Lawson's post resonates because it names something most working engineers already know but feel pressured not to say out loud: the productivity gains are smaller than the marketing claims, and the trade-off is real. The teams that adapt fastest will be the ones that stop treating AI as a typing accelerant and start treating it as a hypothesis generator — one whose output requires the same scrutiny as any other untrusted source. Slower, deliberately. Better, measurably.

Hacker News 1198 pts 443 comments

Using AI to write better code more slowly
