The editorial argues that with 122 bits of randomness yielding ~5.3 × 10³⁶ possible values, the collision probability at typical database sizes is around 10⁻²⁴. Applying Bayesian reasoning, a software bug (prior ~10⁻²) beats a true collision hypothesis by twenty-two orders of magnitude, making the bug explanation overwhelmingly more likely.
The original poster reports that their database flagged a duplicate UUID v4 — one record from 2025 and a new insert in 2026 producing the identical value b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd. They claim to have already ruled out a double-insert bug and were using the standard npm uuid package, presenting this as a real collision.
The editorial notes that every confirmed UUID 'collision' in the wild has traced back to a handful of known root causes: broken pseudorandom number generators, VM snapshot restores that replay the same entropy state, or application-level bugs. Understanding these failure modes is more productive than debating whether the math permits a true collision.
A developer posted to Hacker News with the kind of subject line that makes infrastructure engineers spit out their coffee: "We just had an actual UUID v4 collision." The post, which racked up 337+ points, described a production database flagging a duplicate UUID — one record from 2025 and a fresh insert in 2026 generating the identical value: `b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd`. The developer was using the popular npm `uuid` package and had already ruled out a double-insert bug.
The community response was swift, predictable, and largely correct: no, you didn't.
Not because UUID collisions are physically impossible, but because the probability is so astronomically low that a broken random number generator, a VM snapshot issue, or an application-level bug is always — *always* — the more likely explanation.
UUID v4 uses 122 bits of randomness, producing approximately 5.3 × 10³⁶ possible values. To put that in perspective: generating a billion UUIDs per second, it would take roughly 86 years to reach a 50% probability of a single collision. The birthday paradox threshold sits at ~2.71 × 10¹⁸ UUIDs, or 2.71 quintillion. Most production databases contain millions to low billions of records.
At a few million records, your collision probability is roughly 1 in 10²⁴ (at exactly one million it is closer to 1 in 10²⁵). You are more likely to be struck by lightning while being eaten by a shark during a meteorite impact. The universe does not owe you a UUID collision; it owes you a bug report.
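The figures above can be reproduced with the standard birthday-bound approximation. This is a minimal sketch: the function names are illustrative, and the formula is the usual approximation rather than an exact count.

```python
import math

UUID4_RANDOM_BITS = 122
SPACE = 2 ** UUID4_RANDOM_BITS  # about 5.3e36 possible values


def collision_probability(n: int, space: int = SPACE) -> float:
    """Birthday-bound approximation: P(collision) ~= 1 - exp(-n^2 / (2 * space))."""
    # expm1 keeps precision for tiny exponents; a plain 1 - exp(x) would round to 0.
    return -math.expm1(-(n * n) / (2 * space))


def fifty_percent_threshold(space: int = SPACE) -> float:
    """Number of IDs at which the collision probability reaches about 50%."""
    return math.sqrt(2 * math.log(2) * space)


for n in (10**6, 3_300_000, 10**9):
    print(f"{n:>13,} records -> p ~ {collision_probability(n):.1e}")

threshold = fifty_percent_threshold()
seconds_per_year = 365.25 * 24 * 3600
print(f"50% threshold ~ {threshold:.2e} UUIDs")
print(f"at 1e9 UUIDs/s: ~ {threshold / 1e9 / seconds_per_year:.0f} years")
```

Under this approximation a million records comes out near 10⁻²⁵, a few million near 10⁻²⁴, and even a billion records stays below 10⁻¹⁸.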
This isn't an appeal to incredulity. It's Bayesian reasoning. When the prior probability of a true collision is 10⁻²⁴ and the prior probability of a software bug is, charitably, 10⁻², the bug hypothesis wins by twenty-two orders of magnitude.
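The comparison can be written in the odds form of Bayes' rule. The sketch below uses the two priors quoted in the text. It also makes one explicit assumption the text leaves implicit: either hypothesis, if true, would almost certainly produce the observed duplicate, so both likelihoods are set to 1 and the posterior odds equal the prior odds.

```python
import math

# Priors quoted in the text; both are rough, order-of-magnitude figures.
PRIOR_COLLISION = 1e-24
PRIOR_BUG = 1e-2

# Assumption: both hypotheses explain the duplicate equally well.
LIKELIHOOD_COLLISION = 1.0
LIKELIHOOD_BUG = 1.0


def posterior_odds(prior_a: float, like_a: float, prior_b: float, like_b: float) -> float:
    """Odds of hypothesis A over hypothesis B after seeing the evidence."""
    return (prior_a * like_a) / (prior_b * like_b)


odds = posterior_odds(PRIOR_BUG, LIKELIHOOD_BUG, PRIOR_COLLISION, LIKELIHOOD_COLLISION)
print(f"bug : collision ~ 10^{math.log10(odds):.0f} : 1")
```

Changing either prior by a few orders of magnitude does not change the conclusion, which is why the exact values matter less than the gap between them.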
Every confirmed UUID "collision" in the wild has traced back to one of a handful of root causes. Understanding them is more useful than debating the math.
Broken PRNGs. The most common culprit. JavaScript's `Math.random()` is not cryptographically secure: V8, for example, implements it with xorshift128+, a fast generator whose future output can be reconstructed from a handful of observed values. If UUID generation falls back to `Math.random()` instead of `crypto.getRandomValues()`, collision rates skyrocket from astronomical to merely improbable. Current releases of the npm `uuid` package draw from the platform CSPRNG (Node's `crypto` module on the server, `crypto.getRandomValues()` in browsers), but older versions or custom implementations may not. If your UUID generator doesn't use a CSPRNG, you don't have UUIDs; you have plausible-looking strings.
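A small Python analogue shows how a seeded, non-cryptographic generator produces well-formed but repeatable IDs. The `weak_uuid4` helper and the shared startup timestamp are hypothetical, chosen only to illustrate predictable seeding. They are not taken from the thread.

```python
import random
import uuid


def weak_uuid4(rng: random.Random) -> uuid.UUID:
    """Build a v4-shaped UUID from a non-cryptographic PRNG (not for production use)."""
    # version=4 forces the version and variant bits, so the result looks like a valid v4.
    return uuid.UUID(int=rng.getrandbits(128), version=4)


# Hypothetical scenario: two workers both seed from the same wall-clock second.
startup_second = 1_767_225_600
worker_a = random.Random(startup_second)
worker_b = random.Random(startup_second)

id_a = weak_uuid4(worker_a)
id_b = weak_uuid4(worker_b)
print(id_a, id_b, id_a == id_b)

# The standard library's uuid4() reads from os.urandom(), so there is no seed to share.
print(uuid.uuid4() == uuid.uuid4())
```

Both weak IDs pass a format check as version 4, which is the point: the string shape says nothing about the quality of the randomness behind it.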
VM snapshot cloning. When a virtual machine is cloned from a snapshot, the PRNG state is cloned with it. Two VMs resume from identical internal state, and their next N random values are identical. This has caused real-world UUID collisions in cloud environments, particularly with auto-scaling groups that launch instances from a golden AMI. The fix is to reseed the entropy pool after clone — most modern hypervisors do this, but "most" isn't "all."
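The snapshot effect can be imitated in userspace. This sketch is only an analogy: a real clone duplicates kernel and process state, whereas here a `random.Random` state tuple stands in for the snapshot, and `next_ids` is an illustrative helper.

```python
import os
import random
import uuid


def next_ids(rng: random.Random, count: int) -> list:
    """Draw `count` v4-shaped UUIDs from a userspace PRNG."""
    return [uuid.UUID(int=rng.getrandbits(128), version=4) for _ in range(count)]


# A running "golden" instance with some PRNG state.
golden = random.Random()
snapshot = golden.getstate()  # stands in for the VM snapshot

# Two instances restored from the same snapshot.
clone_a, clone_b = random.Random(), random.Random()
clone_a.setstate(snapshot)
clone_b.setstate(snapshot)
same_before = next_ids(clone_a, 3) == next_ids(clone_b, 3)

# Reseeding from the OS entropy source after the clone breaks the lockstep.
clone_a.seed(os.urandom(32))
clone_b.seed(os.urandom(32))
same_after = next_ids(clone_a, 3) == next_ids(clone_b, 3)

print(same_before, same_after)
```

The clones agree on every value until one of them is reseeded, which mirrors why reseeding after a restore is the fix.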
Docker and low-entropy environments. Containers share the host kernel's entropy pool, and on a freshly booted host or VM that pool starts near-empty. If your application generates UUIDs before enough environmental noise accumulates, the randomness quality degrades. This was a documented issue in early Docker deployments and embedded Linux systems. The `getrandom()` syscall (Linux 3.17+) blocks until the pool has been initialized, and since kernel 5.6 `/dev/random` behaves the same way, which largely solved the problem, but legacy setups persist.
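One way to probe this from application code, assuming a Linux host and Python 3.6 or later, is a non-blocking `getrandom()` call. The helper name `kernel_csprng_ready` is illustrative.

```python
import os


def kernel_csprng_ready():
    """Return True/False if the kernel CSPRNG is initialized, or None if we cannot tell."""
    if not hasattr(os, "getrandom"):
        return None  # os.getrandom() is only exposed on Linux
    try:
        # GRND_NONBLOCK raises instead of waiting when the pool is not yet initialized.
        os.getrandom(16, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False


print("kernel CSPRNG ready:", kernel_csprng_ready())
```

A service could run a check like this at startup and refuse to mint IDs until it returns `True`. A `None` result means the platform needs a different check.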
Application-level bugs. Race conditions in concurrent inserts, retry logic that reuses a previously generated UUID, ORM caching that serves stale objects, database replication conflicts that surface the same UUID on different nodes — these are mundane but extremely common. The developer who rules out a "double-insert bug" in five minutes of checking has almost certainly not finished checking.
Database-level issues. Some ORMs generate the UUID client-side and cache it. Some connection poolers have transparent retry logic. Some replication topologies can surface the same row twice. The UUID isn't colliding; the insert is duplicating.
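The "insert is duplicating" failure mode is easy to reproduce. In this sketch the `orders` table, the `insert_with_retry` helper, and the retry behavior are all hypothetical. SQLite is used only because it ships with Python.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Note: no UNIQUE constraint on id, so duplicates are accepted silently.
conn.execute("CREATE TABLE orders (id TEXT NOT NULL, payload TEXT)")


def insert_with_retry(payload: str, attempts: int = 2) -> None:
    """Hypothetical buggy client: the ID is generated once and reused on every retry."""
    order_id = str(uuid.uuid4())
    for _ in range(attempts):
        # Imagine the first attempt timed out after the server had already committed.
        conn.execute("INSERT INTO orders (id, payload) VALUES (?, ?)", (order_id, payload))


insert_with_retry("first order")

# Audit: an id that appears more than once here is a duplicated insert, not a collision.
duplicates = conn.execute(
    "SELECT id, COUNT(*) FROM orders GROUP BY id HAVING COUNT(*) > 1"
).fetchall()
print(duplicates)

# Adding the constraint afterwards fails on the existing duplicates, exposing the problem.
try:
    conn.execute("CREATE UNIQUE INDEX orders_id_unique ON orders (id)")
    constraint_added = True
except sqlite3.DatabaseError as exc:
    constraint_added = False
    print("cannot add UNIQUE index:", exc)
```

The generator was called exactly once, yet the table holds two rows with the same ID. Without the constraint, nothing would have flagged it.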
This is not the first "I got a UUID collision" post on HN, and it won't be the last. These posts surface roughly once or twice a year, and they follow a pattern: an alarmed developer posts evidence, the community dissects it, and within a day or two, a root cause emerges that isn't random chance.
The pattern persists because UUID v4 occupies an unusual epistemic position: its safety guarantee is probabilistic, not deterministic, and most developers understand the probability poorly. When you tell someone "the odds are 1 in 10²⁴," they hear "not zero" and pattern-match to lottery winners and rare diseases. The distinction between "not zero" and "effectively zero for the lifetime of the observable universe" is genuinely hard to internalize.
The useful framing isn't "can UUIDs collide?" but "what would have to be broken for this to happen?" — and that question always leads you somewhere productive.
If your system ever reports a UUID collision, treat it as a high-priority infrastructure alert, not a curiosity. Something is genuinely wrong — just not the thing you think.
Here's a diagnostic checklist:
1. Verify the UUID source. Confirm your generator uses `crypto.getRandomValues()` (browser/Node), `os.urandom()` (Python), `SecureRandom` (Java/Ruby), or equivalent CSPRNG. If you find `Math.random()`, `rand()`, or a custom implementation anywhere in the chain, you've found your bug.
2. Check for VM/container cloning artifacts. If you run auto-scaling groups, spot instances, or container orchestration that launches from snapshots, verify that entropy is reseeded post-clone. On Linux, check `cat /proc/sys/kernel/random/entropy_avail`: anything below 256 is a red flag (on kernels 5.18 and later the value simply reads 256 once the pool is initialized).
3. Audit the insert path. Trace the exact code path from UUID generation to database insert. Look for retry logic, ORM caching, message queue redelivery, or anything that could reuse a previously generated value.
4. Check database-level uniqueness enforcement. If your UUID column doesn't have a `UNIQUE` constraint, you might have had duplicates for months without noticing. Add one. If it fails on existing data, you've found the real scope of the problem.
5. Consider UUIDv7 if ordering matters. UUIDv7 (RFC 9562, finalized 2024) embeds a millisecond-precision timestamp in the first 48 bits, giving you chronological sorting while retaining 74 bits of randomness. The collision space is smaller per-millisecond, but the practical benefit of sortable, index-friendly IDs is substantial for most applications.
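The UUIDv7 layout from item 5 can be sketched by hand. This assumes the RFC 9562 field order (48-bit millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits). The `uuid7_sketch` helper is illustrative and omits the same-millisecond monotonicity handling that production libraries usually add.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ts_ms=None) -> uuid.UUID:
    """Assemble a UUIDv7-shaped value: 48-bit timestamp, version, variant, 74 random bits."""
    if unix_ts_ms is None:
        unix_ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 bits from the OS CSPRNG; 74 are kept
    rand_a = (rand >> 62) & 0xFFF                 # 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # 62 random bits
    value = (
        ((unix_ts_ms & ((1 << 48) - 1)) << 80)    # bits 127..80: milliseconds since the epoch
        | (0x7 << 76)                             # bits 79..76: version 7
        | (rand_a << 64)                          # bits 75..64: random
        | (0b10 << 62)                            # bits 63..62: RFC variant
        | rand_b                                  # bits 61..0: random
    )
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier, later, str(earlier) < str(later))
```

Because the timestamp occupies the leading bits, plain string or byte comparison orders the IDs by creation time, which is what makes them index-friendly.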
The UUID collision panic is a perennial developer rite of passage, and it's actually healthy — it forces you to audit randomness sources, insert paths, and infrastructure assumptions you'd otherwise never question. The developer in this thread may have found a real bug that was silently corrupting data, and the UUID collision was the canary that caught it. That's the optimistic read. The pessimistic read is that most codebases never add a UNIQUE constraint on their UUID column, and the collisions that don't get caught are the ones you should worry about.
I know what you're thinking... and I still can't believe it, but... This morning, our database flagged a duplicate UUID (v4). I checked, thinking it may have been a double-insert bug or so
Funny story no one will believe, but it’s true. A good friend of mine joined a startup as CTO 10 years ago, high growth phase, maybe 200 devs… In his first week he discovered the company had a microservice for generating new UUIDs. One endpoint with its own dedicated team of 3 engineers …including a
This is usually caused by an insufficiently seeded PRNG. Are you generating the UUID in the backend, or the frontend? Frontend is fundamentally unreliable for many reasons, including deliberate collisions. So in that case you'll need to handle collisions somehow. Though you can still engineer aro
This reminds me of a passage from the book "Pro Git". <https://git-scm.com/book/en/v2> "Here’s an example to give you an idea of what it would take to get a SHA-1 collision. If all 6.5 billion humans on Earth were programming, and every second, each one wa
Some discussion here: https://github.com/uuidjs/uuid/issues/546 E.g.: > FWIW, I just tested crypto.getRandomValues() behavior on googlebot and it is also deterministic(!)
This is surprisingly common. The security of UUIDv4 is based on the assumption of a high-quality entropy source. This assumption is invalidated by hardware defects, normal software bugs, and developers not understanding what "high-quality entropy" actually means and that it is required for