You Didn't Get a UUID Collision. Here's What Actually Happened.

What Happened

A developer posted to Hacker News with the kind of subject line that makes infrastructure engineers spit out their coffee: "We just had an actual UUID v4 collision." The post, which racked up 337+ points, described a production database flagging a duplicate UUID — one record from 2025 and a fresh insert in 2026 generating the identical value: `b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd`. The developer was using the popular npm `uuid` package and had already ruled out a double-insert bug.

The community response was swift, predictable, and largely correct: no, you didn't.

Not because UUID collisions are physically impossible, but because the probability is so astronomically low that a broken random number generator, a VM snapshot issue, or an application-level bug is always — *always* — the more likely explanation.

The Math That Makes This (Almost) Impossible

UUID v4 uses 122 bits of randomness, producing approximately 5.3 × 10³⁶ possible values. To put that in perspective: a single system generating a billion UUIDs per second would need roughly 86 years to reach a 50% probability of even one collision. That birthday-paradox threshold sits at ~2.71 × 10¹⁸ UUIDs, or 2.71 quintillion. Most production databases contain millions to low billions of records.

At a million records, your collision probability is roughly 1 in 10²⁵ (about n²/2N for n = 10⁶ IDs drawn from N = 2¹²² values). You are more likely to be struck by lightning while being eaten by a shark during a meteorite impact. The universe does not owe you a UUID collision; it owes you a bug report.
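The figures in this section can be checked with the standard birthday approximation. The sketch below is illustrative rather than anything from the original thread; the function names are made up for this example, and the only assumption is that all 122 random bits are uniformly distributed.

```python
import math

UUID4_SPACE = 2 ** 122  # 122 random bits in a version 4 UUID


def collision_probability(n: int, space: int = UUID4_SPACE) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * space))."""
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-(n * (n - 1)) / (2 * space))


def ids_for_probability(p: float, space: int = UUID4_SPACE) -> float:
    """Approximate number of IDs at which the collision probability reaches p."""
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


SECONDS_PER_YEAR = 365.25 * 24 * 3600

p_million = collision_probability(10 ** 6)          # about 9.4e-26
threshold = ids_for_probability(0.5)                # about 2.71e18
years = threshold / 1e9 / SECONDS_PER_YEAR          # at one billion IDs per second

print(f"p(collision) at 1e6 IDs: {p_million:.2e}")
print(f"IDs for 50% collision:   {threshold:.2e}")
print(f"years at 1e9 IDs/second: {years:.0f}")
```

Plugging in your own table size is usually more persuasive than quoting the general odds.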

This isn't an appeal to incredulity. It's Bayesian reasoning. When the prior probability of a true collision is around 10⁻²⁵ and the prior probability of a software bug is, charitably, 10⁻², the bug hypothesis wins by roughly twenty-three orders of magnitude.

The Usual Suspects

Reported UUID "collisions" in the wild almost always trace back to one of a handful of root causes. Understanding them is more useful than debating the math.

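Broken PRNGs. The most common culprit. JavaScript's `Math.random()` is not cryptographically secure: current engines such as V8 implement it with xorshift128+, a fast generator whose internal state can be reconstructed from a handful of outputs, and older engines used weaker algorithms still. If UUID generation falls back to `Math.random()` instead of `crypto.getRandomValues()`, the odds of a collision rise from astronomically unlikely to entirely plausible. Current releases of the npm `uuid` package draw from the platform CSPRNG (Node's `crypto` module on the server, `crypto.getRandomValues()` in browsers), but releases before 7.0 could fall back to `Math.random()` in browsers that lacked a crypto API, and custom implementations may do the same. If your UUID generator doesn't use a CSPRNG, you don't have UUIDs; you have plausible-looking strings.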
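To make the failure mode concrete, here is a minimal sketch in Python rather than JavaScript. Python's `random.Random` (a Mersenne Twister) stands in for any non-cryptographic generator, and both `weak_uuid4` and the 16-bit, PID-derived seed are hypothetical choices for illustration, not how any particular library behaves.

```python
import random
import uuid


def weak_uuid4(rng: random.Random) -> uuid.UUID:
    """Build a value that looks like a v4 UUID from a non-cryptographic PRNG."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


def first_id_for_seed(seed: int) -> uuid.UUID:
    """First ID a freshly seeded process would hand out."""
    return weak_uuid4(random.Random(seed))


# Hypothetical: each process seeds its generator from a 16-bit value,
# so only 65,536 distinct ID sequences can ever exist.
process_ids = [1234, 70000, 1234 + 65536]
seeds = [pid % 65536 for pid in process_ids]
ids = [first_id_for_seed(s) for s in seeds]

# The first and third processes share a seed, so they emit the same "UUID".
print(ids[0] == ids[2])
print(ids[0] == ids[1])
```

The strings pass every format check, which is why this class of bug tends to surface as a duplicate-key error long after deployment.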

VM snapshot cloning. When a virtual machine is cloned from a snapshot that includes memory, the PRNG state is cloned with it. Two VMs resume from identical internal state, and until one of them reseeds, their next N random values are identical. This has caused real-world UUID collisions in cloud environments, particularly with auto-scaling groups that launch instances from a golden AMI. The fix is to reseed the entropy pool after clone. Most modern hypervisors and guest kernels do this, but "most" isn't "all."
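The effect is easy to reproduce in miniature. The sketch below models a snapshot by copying the state of a userspace generator; it is an analogy for what happens to PRNG state in a cloned VM, not a simulation of any hypervisor, and `uuid_from` is a helper invented for this example.

```python
import random
import uuid


def uuid_from(rng: random.Random) -> uuid.UUID:
    """Format 128 bits from the given generator as a v4-style UUID."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


# The "original VM": a generator seeded from OS entropy and already in use.
original = random.Random()
uuid_from(original)

# The "snapshot restore": a second generator resumes from the same state.
clone = random.Random()
clone.setstate(original.getstate())

ids_original = [uuid_from(original) for _ in range(3)]
ids_clone = [uuid_from(clone) for _ in range(3)]
print(ids_original == ids_clone)  # identical sequences

# The fix: reseed after the clone so the two streams diverge.
clone.seed()  # no argument: reseeds from OS entropy
print(uuid_from(original) == uuid_from(clone))
```

Every ID was generated "randomly" on both machines; the streams simply started from the same place.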

Docker and low-entropy environments. Containers share the host kernel's random number generator, so the risk comes from the machine underneath: a freshly booted VM or embedded board whose entropy pool has not yet been initialized. `/dev/urandom` returns output during that window without blocking, so an application that generates UUIDs very early can get low-quality randomness. This was a documented issue in early Docker deployments and embedded Linux systems. The `getrandom()` syscall (Linux 3.17+) blocks until the pool is initialized, and since kernel 5.6 `/dev/random` behaves the same way, which largely solves the problem, but legacy setups persist.
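One way to probe for this condition at startup is a non-blocking `getrandom()` call, which fails instead of waiting when the kernel pool is not yet initialized. This is a minimal sketch, assuming Linux and Python 3.6 or later; `kernel_rng_ready` is a name made up for this example.

```python
import os
from typing import Optional


def kernel_rng_ready() -> Optional[bool]:
    """Report whether the kernel RNG is initialized.

    Returns True or False on Linux, and None where the check is unavailable.
    """
    if not hasattr(os, "getrandom"):
        return None  # not Linux, or a Python build without getrandom()
    try:
        # GRND_NONBLOCK raises instead of blocking on an uninitialized pool.
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False
    except OSError:
        return None  # e.g. the syscall is filtered or unsupported


status = kernel_rng_ready()
print(f"kernel RNG ready: {status}")
```

A service could log this once at boot and refuse to mint IDs while it reports `False`.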

Application-level bugs. Race conditions in concurrent inserts, retry logic that reuses a previously generated UUID, ORM caching that serves stale objects, database replication conflicts that surface the same UUID on different nodes — these are mundane but extremely common. The developer who rules out a "double-insert bug" in five minutes of checking has almost certainly not finished checking.
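The retry case deserves a closer look because it produces a genuine duplicate-key error with no randomness involved. In this sketch, `FlakyStore`, `DuplicateKey`, and `save_with_retry` are all hypothetical stand-ins: the store commits the first write but loses the acknowledgement, and the caller retries with the ID it already generated.

```python
import uuid


class DuplicateKey(Exception):
    """Raised when a key already exists, like a unique-constraint violation."""


class FlakyStore:
    """Hypothetical store whose first write commits but whose ack is lost."""

    def __init__(self):
        self.rows = {}
        self.calls = 0

    def insert(self, key, value):
        self.calls += 1
        if key in self.rows:
            raise DuplicateKey(key)
        self.rows[key] = value
        if self.calls == 1:
            raise TimeoutError("ack lost after commit")


def save_with_retry(store, value, attempts=3):
    key = uuid.uuid4()  # generated once, reused by every retry
    for _ in range(attempts):
        try:
            store.insert(key, value)
            return key
        except TimeoutError:
            continue  # the write may already have committed
    raise RuntimeError("gave up after retries")


store = FlakyStore()
try:
    save_with_retry(store, "payload")
    saw_duplicate = False
except DuplicateKey:
    saw_duplicate = True

print(saw_duplicate, len(store.rows), store.calls)
```

Reusing the ID across retries is actually the right call for idempotency; the bug is treating the resulting duplicate-key error as a collision instead of as proof that the first attempt succeeded.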

Database-level issues. Some ORMs generate the UUID client-side and cache it. Some connection poolers have transparent retry logic. Some replication topologies can surface the same row twice. The UUID isn't colliding; the insert is duplicating.

Why This Story Keeps Recurring

This is not the first "I got a UUID collision" post on HN, and it won't be the last. These posts surface roughly once or twice a year, and they follow a pattern: an alarmed developer posts evidence, the community dissects it, and within a day or two, a root cause emerges that isn't random chance.

The pattern persists because UUID v4 occupies an unusual epistemic position: its safety guarantee is probabilistic, not deterministic, and most developers understand the probability poorly. When you tell someone "the odds are 1 in 10²⁵," they hear "not zero" and pattern-match to lottery winners and rare diseases. The distinction between "not zero" and "effectively zero for the lifetime of the observable universe" is genuinely hard to internalize.

The useful framing isn't "can UUIDs collide?" but "what would have to be broken for this to happen?" — and that question always leads you somewhere productive.

What This Means for Your Stack

If your system ever reports a UUID collision, treat it as a high-priority infrastructure alert, not a curiosity. Something is genuinely wrong — just not the thing you think.

Here's a diagnostic checklist:

1. Verify the UUID source. Confirm your generator uses `crypto.getRandomValues()` (browser/Node), `os.urandom()` (Python), `SecureRandom` (Java/Ruby), or equivalent CSPRNG. If you find `Math.random()`, `rand()`, or a custom implementation anywhere in the chain, you've found your bug.

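2. Check for VM/container cloning artifacts. If you run auto-scaling groups, spot instances, or container orchestration that launches from snapshots, verify that entropy is reseeded post-clone. On Linux, check `cat /proc/sys/kernel/random/entropy_avail`. Recent kernels (5.18 and later) report a constant 256 once the pool is initialized, so anything lower is a red flag; older kernels report a fluctuating estimate of up to 4096 bits, where a reading that stays in the low hundreds deserves a closer look.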

3. Audit the insert path. Trace the exact code path from UUID generation to database insert. Look for retry logic, ORM caching, message queue redelivery, or anything that could reuse a previously generated value.

4. Check database-level uniqueness enforcement. If your UUID column doesn't have a `UNIQUE` constraint, you might have had duplicates for months without noticing. Add one. If it fails on existing data, you've found the real scope of the problem.

5. Consider UUIDv7 if ordering matters. UUIDv7 (RFC 9562, finalized 2024) embeds a millisecond-precision timestamp in the first 48 bits, giving you chronological sorting while retaining 74 bits of randomness. The collision space is smaller per-millisecond, but the practical benefit of sortable, index-friendly IDs is substantial for most applications.
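Step 4 of the checklist can be rehearsed against a throwaway database before touching production. The sketch below uses an in-memory SQLite database; the `events` table, its columns, and the index name are hypothetical, and other databases will report the failure with their own error types.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical table that stores UUIDs without any uniqueness constraint.
conn.execute("CREATE TABLE events (id TEXT NOT NULL, body TEXT)")

replayed_id = str(uuid.uuid4())
conn.executemany(
    "INSERT INTO events (id, body) VALUES (?, ?)",
    [
        (replayed_id, "first write"),
        (str(uuid.uuid4()), "unrelated row"),
        (replayed_id, "same ID written again by a retry"),
    ],
)

# Audit: list every ID that appears more than once.
duplicates = conn.execute(
    "SELECT id, COUNT(*) FROM events GROUP BY id HAVING COUNT(*) > 1"
).fetchall()

# Enforcement: adding the unique index fails while duplicates exist.
try:
    conn.execute("CREATE UNIQUE INDEX events_id_unique ON events (id)")
    index_created = True
except sqlite3.IntegrityError:
    index_created = False

print(duplicates)
print(index_created)
```

Running the audit query first tells you how many rows are affected before the constraint starts rejecting writes.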

Looking Ahead

The UUID collision panic is a perennial developer rite of passage, and it's actually healthy: it forces you to audit randomness sources, insert paths, and infrastructure assumptions you'd otherwise never question. The developer in this thread may have found a real bug that was silently corrupting data, and the UUID collision was the canary that caught it. That's the optimistic read. The pessimistic read is that many codebases never add a UNIQUE constraint on their UUID column, and the collisions that don't get caught are the ones you should worry about.
