Lawson reports that heavy use of Claude, Cursor, and Copilot has reduced his throughput while improving his code quality, and he considers this the right trade-off. He treats every AI suggestion as a hypothesis from an unreliable junior engineer — hand-writing plans before prompting, reading diffs line-by-line, and rewriting generated tests that pass rather than probe.
Lawson argues that the industry's dominant framing — Cursor's 'ship faster,' Copilot's accepted-suggestion rates, Devin's tasks-per-hour — optimizes for a metric that demos well but misses the point. He doesn't reject the tools, just the throughput-centric KPI vendors and finance departments have converged on.
The top comments on the HN thread — staff engineers, tech leads, and infra practitioners — describe converging independently on Lawson's pattern. The recurring observation: experienced engineers report slowing down with AI to maintain quality, while less experienced coders report speeding up, suggesting the productivity gains are inversely correlated with existing skill.
Nolan Lawson — ex-Microsoft Edge engineer, longtime Mastodon contributor, and one of the more sober voices on web performance — published *Using AI to write better code more slowly* on May 25. The post climbed to 1,074 points on Hacker News within a day, which for a personal blog with no launch, no product, and no hot-take headline is a signal worth paying attention to.
The argument is structurally simple and tonally unfashionable: Lawson uses Claude, Cursor, and Copilot heavily — and reports that his throughput has gone *down*, his code quality has gone *up*, and he considers this the correct trade. He describes a workflow where every AI suggestion is treated as a hypothesis from a confident but unreliable junior engineer. Plans are written by hand before prompting. Diffs are read line-by-line, often re-prompted three or four times. Generated tests are deleted and rewritten because the model writes tests that pass rather than tests that probe. The net effect is that a 20-minute task becomes a 35-minute task, and the resulting PR is one Lawson is willing to put his name on.
The HN thread is notably free of the usual AI-coding flame war. The top comments come from practitioners (staff engineers, tech leads, infra folks) who describe converging on the same pattern independently. The common thread: people who already knew how to code well report slowing down with AI, while people who did not already code well report speeding up.
The dominant marketing pitch for AI coding tools is throughput. Cursor's homepage talks about "shipping faster." GitHub's Copilot metrics lead with accepted-suggestion rates. Cognition's Devin demos optimize for tasks-per-hour. Every vendor has converged on the same KPI because it's the one that demos well and the one finance departments understand.
Lawson's piece is interesting because it rejects the KPI without rejecting the tool. He's not arguing AI coding is bad — he's arguing that measuring it by speed is the same category error as measuring a senior engineer by lines of code per day. The value of a senior engineer isn't volume; it's judgment compounded over the lifetime of the codebase. If the AI removes the typing tax but you spend the savings on more typing instead of more thinking, you've optimized the wrong variable.
This lines up with what's actually showing up in the data. METR's randomized controlled trial, published in July 2025, found that experienced open-source maintainers were on average 19% slower when using AI assistants, despite believing themselves to be ~20% faster. Stack Overflow's 2026 developer survey shows trust in AI-generated code has dropped for the second year running, from 43% in 2024 to 29% now, even as adoption climbs past 80%. The split between "I use it constantly" and "I don't trust what it gives me" is the defining cognitive dissonance of the current era of software engineering.
The community reaction is also instructive about what's changed. Two years ago a post like this would have been dismissed as a Luddite take or a humblebrag. Now the top reply is a Shopify principal engineer saying, in effect, *yes, and we've started measuring "PRs that survive 90 days without a follow-up fix" as a quality counter-metric to acceptance rate.* The conversation has matured past "AI good" vs "AI bad" into the much more useful question of *what process discipline survives contact with a tireless code generator.*
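A "survives 90 days without a follow-up fix" metric like the one described above could be computed roughly as follows. This is a minimal sketch, not the implementation the commenter's team uses: the `MergedPR` record, its field names, and the way a follow-up fix gets linked back to a PR are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed window, taken from the "90 days" figure quoted in the comment.
SURVIVAL_WINDOW = timedelta(days=90)


@dataclass
class MergedPR:
    """Hypothetical record; field names are illustrative only."""
    number: int
    merged_at: datetime
    # Timestamp of the first follow-up fix linked to this PR, if any.
    followup_fix_at: Optional[datetime] = None


def survival_rate(prs, as_of):
    """Share of PRs that went 90 days after merge without a follow-up fix.

    PRs merged fewer than 90 days before `as_of` are excluded, because
    their outcome is not known yet. Returns None if no PR is eligible.
    """
    eligible = [pr for pr in prs if as_of - pr.merged_at >= SURVIVAL_WINDOW]
    if not eligible:
        return None
    survived = sum(
        1
        for pr in eligible
        if pr.followup_fix_at is None
        or pr.followup_fix_at - pr.merged_at > SURVIVAL_WINDOW
    )
    return survived / len(eligible)
```

Excluding PRs younger than the window matters: counting them as survivors would flatter the most recent (and likely most AI-assisted) work.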
If you lead an engineering team, the most actionable read is to stop reporting Copilot acceptance rates in your weekly metrics. They measure the wrong thing and they create the wrong incentives. Replace them with a paired counter-metric — revert rate, post-merge defect rate, or change-failure rate on AI-touched PRs versus human-only PRs. If the counter-metric stays flat while throughput rises, great. If it climbs, you've discovered that your speedup is technical debt with a faster compile time.
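The paired counter-metric can be sketched in a few lines. The example below compares failure rates for AI-touched and human-only PRs; the dictionary keys `ai_touched` and `reverted_or_hotfixed` are hypothetical, and how a team labels a PR as AI-touched or as failed is left as an assumption.

```python
from collections import defaultdict


def failure_rate_by_cohort(prs):
    """Failure rate per cohort.

    `prs` is an iterable of dicts with two hypothetical boolean keys:
    'ai_touched' and 'reverted_or_hotfixed'.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for pr in prs:
        cohort = "ai_touched" if pr["ai_touched"] else "human_only"
        totals[cohort] += 1
        failures[cohort] += int(pr["reverted_or_hotfixed"])
    return {cohort: failures[cohort] / totals[cohort] for cohort in totals}


def counter_metric_gap(prs):
    """AI-touched failure rate minus human-only failure rate.

    Returns None when either cohort is empty, since the comparison
    is undefined in that case.
    """
    rates = failure_rate_by_cohort(prs)
    if "ai_touched" not in rates or "human_only" not in rates:
        return None
    return rates["ai_touched"] - rates["human_only"]
```

A gap near zero while throughput rises is the good outcome described above; a gap that widens over successive weeks is the warning sign. With small weekly samples the number will be noisy, so a trend is more informative than any single reading.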
For individual contributors, Lawson's workflow is portable. The pattern is: write the plan yourself in plain English before you prompt; require the model to explain *why* before you accept a diff; never accept generated tests without rewriting the assertions; and treat refactors as the one place to absolutely not use AI, because the model will helpfully "clean up" invariants you didn't realize you were depending on. This is roughly the workflow that Simon Willison, Armin Ronacher, and Mitchell Hashimoto have all converged on in their own writing over the last six months — different vocabularies, same shape.
The contrarian read is that this advice only applies to people who already have the taste to know what good code looks like. Junior engineers using AI may genuinely be faster *and* produce passable code, because the model's output is a strict upgrade on what they'd have written themselves. That's true, and it's also the reason the next five years of hiring are going to be brutal: the gap between someone who can review AI output and someone who can only produce it is the new seniority signal, and it's not something you can fake in an interview.
The interesting second-order question is what this does to the agentic-coding pitch. Tools like Devin, Claude Code's autonomous mode, and Cursor's background agents are betting on the opposite premise — that the human reviewer should be removed from the inner loop entirely. Lawson's post is, implicitly, the case against that bet for any code that has to live longer than a sprint. Whether the next 18 months prove him right or wrong will depend less on model capability and more on whether the industry develops the discipline to measure code by what it costs to maintain, not what it cost to write.
I do something similar. Design reviews are an extremely valuable part of the process but they take a long time and people are busy and who really has enough expertise to provide meaningful feedback? Well, the AI is always available and it can rip through my entire codebase in just a few minutes and
This article doesn't address writing code with AI, just code review. My issue with agentic coding is that I make numerous micro-architectural decisions while programming. I almost never have a full spec up front and develop one as I consider what I am writing. When using Claude Code or Codex, th
As a junior, I do actually enjoy going back and forth with the AI discussing different ways to implement something and exploring alternatives. More often than not, I'd have an architectural idea that I'm not that confident in. The process of talking with the LLM takes a long time but it hel
I find myself spending on average more time in LLM review/resolution loops than it would take for me to write the code by hand. Partially because once I'm in the flow I write very very quickly and the code pours out sometimes faster than I can write. But also because the LLM code on the fi
I've hit this point with AI where it's not a simple process, but a long drawn out back and forth. I'll use AI to design the implementation of a medium sized, cross cutting feature. Review all the details, maybe iterate on just that. Then implement with Claude 4.7 Max - which runs slowe