Kingsbury argues the debate about whether LLMs are 'good enough' misses the point entirely. The systems engineers rely on — code review, issue triage, peer review, moderation — were all built as cost-asymmetry machines assuming human-scale effort to produce content. LLMs invert that asymmetry, and the resulting flood of fluent-but-unverified text is devouring maintainer time faster than any existing filter can cope.
The editorial cites a pattern of defensive moves across the ecosystem: curl's Daniel Stenberg publicly complaining about AI-generated CVE reports, the Linux kernel tightening contributor policies, and Stack Overflow's still-unrescinded December 2022 ban on unmarked GPT answers. These aren't isolated incidents but early evidence that trusted information sources are being flooded faster than they can filter.
The editorial argues aphyr's essay landed (138 HN points for a meditative personal post) precisely because of his Jepsen background. A career spent formally proving that distributed systems lie gives him unusual standing to diagnose an ecosystem drowning in plausible-sounding text with no verification layer underneath.
Kyle Kingsbury — better known as aphyr, the engineer behind Jepsen's distributed systems torture tests — published a long essay on his blog titled *The Future of Everything Is Lies, I Guess: Where Do We Go from Here?* It hit 138 points on Hacker News, which for a meditative personal post rather than a product launch is a meaningful signal about where the practitioner mood sits right now.
The thesis, stripped to its load-bearing beams: the information substrate that working engineers depend on — documentation, bug trackers, forum answers, search results, code review, even peer-reviewed papers — is being flooded with fluent, confident, plausible-sounding generated text faster than any existing social or technical system can filter it. The problem is not that LLMs occasionally hallucinate; the problem is that producing plausible nonsense is now cheaper than refuting it, and that asymmetry is devouring maintainer time.
Aphyr is not the first person to say this. curl's Daniel Stenberg has been on record for over a year about AI-generated CVE reports eating his weekends. The Linux kernel has tightened contributor policies. Stack Overflow banned unmarked GPT answers in December 2022 and has not walked it back. What makes this essay land is the voice — a person whose entire career is built on formally verifying that systems lie, now watching the rest of the field drown in fluency without truth.
The familiar response to AI content is to argue about quality: is GPT-5 better than a junior, can Claude write a correct sort function, etc. Aphyr's framing cuts under that debate. He's not asking whether the models are good. He's asking what happens to the ecosystems humans built on an implicit assumption that producing a comment, a bug report, or a paper took roughly human-scale effort. Every system we use — code review, issue triage, peer review, moderation — is a cost-asymmetry machine, and we just inverted the cost asymmetry.
Compare the responses so far. The optimistic camp (most VCs, most model labs) says detection and provenance tooling will catch up: watermarking, C2PA, attestation, retrieval-augmented verification. The pessimistic camp — which increasingly includes maintainers who have to live with the output — says watermarking is defeated by paraphrase, C2PA requires platform cooperation that will not arrive, and the economic gradient points the wrong way. Platforms that would filter slop are also the platforms selling slop generation as a feature. The slop is not a bug in the business model; for most of the companies producing it, the slop is the product.
There's a useful analogy buried in here for anyone who has worked on distributed systems, which is probably why aphyr reached for it. Byzantine fault tolerance assumes a bounded fraction of lying nodes. The classical results — PBFT, HotStuff — need honest supermajorities. Our social information protocols (Stack Overflow reputation, GitHub review, arXiv endorsement, Wikipedia consensus) were also implicitly Byzantine systems with honest-majority assumptions. Generative models don't add a few more Byzantine nodes; they let one actor spin up arbitrarily many at near-zero cost. That is not a condition the original protocol was designed for, and patching it after the fact is — to borrow aphyr's own professional experience — historically very hard.
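To make that concrete, here is a back-of-envelope sketch in Python (the node counts are illustrative, not from the essay). Classical BFT safety needs n ≥ 3f + 1, so with a fixed pool of h honest participants the guarantee survives only while the attacker's Sybil count s satisfies 2s ≤ h − 1: roughly half the honest population, a bar that API-priced identities clear trivially.

```python
def bft_safe(honest: int, sybil: int) -> bool:
    """Classical BFT safety condition: n >= 3f + 1, treating every
    Sybil identity as faulty (f = sybil, n = honest + sybil)."""
    n, f = honest + sybil, sybil
    return n >= 3 * f + 1

# A community with 100 honest reviewers. Under human-cost assumptions,
# recruiting even a few dozen fake identities was expensive.
honest = 100
for sybil in (10, 49, 50, 1000):
    print(f"{sybil:>5} sybils: safe={bft_safe(honest, sybil)}")

# Output:
#    10 sybils: safe=True
#    49 sybils: safe=True
#    50 sybils: safe=False   <- threshold: 2s <= h - 1 fails here
#  1000 sybils: safe=False   <- and the API bill barely notices
```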
The community reaction on the HN thread leaned toward grim agreement, with one recurring dissent worth engaging: *you always could fake this, spam is old, the internet has been a sewer for decades.* True but incomplete. The cost floor matters. Email spam was cheap but undifferentiated; content farms were scale-limited by human writers; Sybil attacks on forums were bottlenecked by captchas and karma. Generative models collapsed all three constraints into a single line item on an API bill. Quantitative change at three orders of magnitude is a qualitative change, and pretending otherwise is how you end up maintaining curl.
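To put a crude number on "three orders of magnitude": every figure below is an assumption chosen for illustration, not a measurement, but the ratio survives any reasonable tweaking of the inputs.

```python
# All figures are illustrative assumptions, not measurements.
HUMAN_REPORT_MIN  = 30      # minutes to hand-write a plausible bug report
HUMAN_RATE_USD_HR = 50      # a modest engineering wage
LLM_TOKENS        = 1500    # tokens for a fluent, well-formatted report
LLM_USD_PER_MTOK  = 10.0    # assumed API price per million output tokens

human_cost = HUMAN_REPORT_MIN / 60 * HUMAN_RATE_USD_HR   # $25.00
llm_cost   = LLM_TOKENS / 1e6 * LLM_USD_PER_MTOK         # $0.015

print(f"human: ${human_cost:.2f}  llm: ${llm_cost:.3f}  "
      f"ratio: {human_cost / llm_cost:,.0f}x")           # ~1,667x
```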
First, tighten your trust graph. If you maintain anything that accepts public contributions, the era of treating a well-formatted bug report as prima facie evidence of good faith is over. Require reproduction steps that exercise actual code paths. Auto-close reports that can't produce a failing test. Several large OSS projects have quietly adopted variants of this and seen triage volume fall without a corresponding loss of real bugs — the signal was always concentrated in reports that came with evidence.
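A naive sketch of what "no evidence, no triage" might look like; the heuristic here (a fenced code block containing something that reads like a failing check) and the function names are made-up stand-ins for whatever evidence bar a real project would set.

```python
import re

REPRO_HINTS = ("assert", "Traceback", "FAILED", "panic", "Segmentation fault")

def has_runnable_evidence(issue_body: str) -> bool:
    """Crude filter: does the report contain a fenced code block
    *and* something that looks like a failing check or a crash?"""
    fenced = re.findall(r"`{3}.*?`{3}", issue_body, flags=re.DOTALL)
    return any(hint in block for block in fenced for hint in REPRO_HINTS)

def triage(issue_body: str) -> str:
    if has_runnable_evidence(issue_body):
        return "queue-for-human"
    # Auto-close with a template asking for a failing test, not silence.
    return "close: please attach a reproduction with a failing test"
```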
Second, move your own reading upstream. RSS against a curated list of humans beats algorithmic feeds. Primary sources beat summaries. For security specifically, prefer vendor advisories and signed commits to aggregator posts; for language ecosystems, prefer release notes from core maintainers to Medium explainers. This is not Luddism — it is the same discipline you apply when you read a distributed systems paper and check the authors before you check the abstract.
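A minimal version of that reading setup, using the feedparser library; the feed URLs below are placeholders for your own curated list of humans.

```python
import feedparser  # pip install feedparser

# Placeholder feeds -- substitute your own curated list.
CURATED = [
    "https://aphyr.com/posts.atom",
    "https://daniel.haxx.se/blog/feed/",
]

for url in CURATED:
    feed = feedparser.parse(url)
    for entry in feed.entries[:3]:  # latest three posts per author
        print(f"{feed.feed.get('title', url)}: {entry.title}\n  {entry.link}")
```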
Third, audit your dependencies like the supply chain they are. Star counts are manipulable. Download counts are manipulable. What is harder to fake is a five-year commit history from a named human whose other work you can read. If you cannot, in ten minutes, name a human being responsible for a library you are about to pull into production, that is now a risk signal, not a neutral fact. Package ecosystems from npm to PyPI to crates.io have already seen typosquatting and slop-maintainer takeover incidents; the cost of diligence is lower than the cost of the first incident.
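A rough script for the ten-minute test, assuming the dependency is already cloned locally; the thresholds at the bottom are arbitrary illustrations, not recommendations.

```python
import subprocess
from datetime import datetime, timezone

def provenance(repo_path: str) -> dict:
    """Summarize a cloned dependency's history: first-commit age and
    distinct author names. Heuristics only -- both can be forged, but
    forging five years of plausible history costs more than a README."""
    def git(*args: str) -> str:
        return subprocess.check_output(
            ["git", "-C", repo_path, *args], text=True
        ).strip()

    first_commit_ts = int(git("log", "--reverse", "--format=%at").splitlines()[0])
    age_years = (datetime.now(timezone.utc)
                 - datetime.fromtimestamp(first_commit_ts, timezone.utc)).days / 365
    authors = set(git("log", "--format=%an").splitlines())
    return {"age_years": round(age_years, 1), "named_authors": len(authors)}

# Arbitrary illustrative bar: under two years old with a single author
# is a "look closer" signal, not an automatic rejection.
info = provenance("./vendor/somelib")  # placeholder path to a cloned dep
if info["age_years"] < 2 and info["named_authors"] < 2:
    print("risk signal:", info)
```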
Fourth, assume your own outputs are being scraped into the next model and act accordingly. This is less a security concern than a civic one. If you write good technical prose and publish it openly, you are feeding the same system that is degrading the commons. Aphyr doesn't offer a tidy answer here and neither should anyone else; the tradeoffs are real. But the question deserves to be asked out loud rather than deferred.
Aphyr's essay is deliberately short on prescription — the title ends with a question mark for a reason. The honest forecast is that the near-term equilibrium is worse information, smaller and more defensive communities, and a bifurcated web where signed, known-human content is a premium tier and everything else is treated as suspect by default. That is not a pleasant outcome but it is a legible one, and legibility is the first thing you need before you can build anything. The engineers most likely to come through this with working systems are the ones who are already acting as if the trust substrate is gone — because operationally, for a growing class of problems, it is.