Everything in C is undefined behavior

What happened

Thomas Habets' blog post "Everything in C is undefined behavior" hit the Hacker News front page this week with 351 points and a comment thread that reads like a group therapy session for systems programmers. The thesis is uncomfortably simple: the C standard contains so many traps marked *undefined behavior* — over 200 distinct ones in C17, depending on how you count — that any program of meaningful size will hit at least one. And once you've hit one, the compiler is allowed to do anything: emit the code you wrote, delete it, replace it with `ud2`, or rewrite the surrounding logic as if the UB-triggering branch were unreachable.

Habets walks through the usual suspects (signed integer overflow, shifting by the width of the type, reading uninitialized memory, strict aliasing violations, modifying a string literal) and then keeps going. Pointer arithmetic across allocation boundaries is UB. Comparing pointers from different objects with `<` is UB. Calling `memcpy` with a null source pointer and a length of zero is UB. Even `INT_MIN % -1` is UB: the standard makes `a % b` undefined whenever the quotient `a / b` is not representable, regardless of platform. He notes that LLVM and GCC will, when they detect these, happily prune entire control-flow paths.
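To make three of those traps concrete, here is a minimal sketch of guard functions that reject the undefined cases before the operation runs. The names `checked_add`, `checked_shl`, and `checked_mod` are hypothetical helpers invented for this illustration, not something from Habets' post, and the shift helper assumes a 32-bit operand.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adds a and b. Returns false instead of performing an addition that
 * would overflow, because signed overflow is undefined behavior. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Shifts x left by n. A shift count greater than or equal to the width
 * of the (promoted) type is undefined, so counts of 32 or more are rejected. */
bool checked_shl(uint32_t x, unsigned n, uint32_t *out)
{
    assert(out != NULL);
    if (n >= 32)
        return false;
    *out = x << n;
    return true;
}

/* Computes a % b. Division by zero and INT_MIN % -1 are both undefined,
 * so both are rejected up front. */
bool checked_mod(int a, int b, int *out)
{
    assert(out != NULL);
    if (b == 0 || (a == INT_MIN && b == -1))
        return false;
    *out = a % b;
    return true;
}
```

The important detail is that every check is phrased so the check itself cannot overflow; testing `a + b < a` after the fact would already be UB for signed operands.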

The post lands in the middle of a longer-running argument. Regehr, Lattner, and others have been making versions of this point for fifteen years. What makes Habets' framing land is the inversion: instead of "avoid these patterns," he asks you to find a non-trivial C program that provably contains zero UB. He can't. Neither can the commenters. Neither, importantly, can Linus Torvalds, whose `-fno-strict-aliasing -fno-delete-null-pointer-checks` flag list in the kernel Makefile reads like a confession.

Why it matters

The practitioner-relevant point is not "C is bad." The point is that the language you're compiling is not the language you think you're writing. Modern optimizers treat UB as a precondition, not a warning: if your code would only matter when UB is triggered, the optimizer is allowed to delete it. That's how the infamous CVE-2009-1897 happened — a null check in `tun_chr_poll` was removed by GCC because an earlier dereference "proved" the pointer was non-null. The kernel had a privilege escalation for months because of an optimization the standard explicitly permits.
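The shape of that bug is easy to reproduce in miniature. The sketch below is not the kernel's code: `struct tun_like`, `struct sock_like`, `EX_POLLERR`, and `poll_flags` are hypothetical stand-ins, and the buggy ordering appears only in a comment. What it shows is the safe ordering, where the null check runs before any dereference.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures involved. */
struct sock_like { unsigned flags; };
struct tun_like  { struct sock_like *sk; };

#define EX_POLLERR 0x8u  /* illustrative error flag, not the kernel's value */

/*
 * Buggy ordering (shown for contrast, do not use):
 *
 *     struct sock_like *sk = tun->sk;   // dereference happens first...
 *     if (!tun)                         // ...so the optimizer may assume
 *         return EX_POLLERR;            //    tun != NULL and drop this check
 *
 * Dereferencing a null pointer is UB, so after the first line the compiler
 * is entitled to treat `tun` as non-null on every path that follows.
 */
unsigned poll_flags(const struct tun_like *tun)
{
    /* Check before any dereference, so the check cannot be proven redundant. */
    if (tun == NULL || tun->sk == NULL)
        return EX_POLLERR;
    return tun->sk->flags;
}
```

The kernel's `-fno-delete-null-pointer-checks` flag is a belt-and-braces measure against this pattern, but ordering the check first is what makes the code correct under any flag set.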

This is also why the Rust-vs-C debate keeps refusing to die. Rust's safety story isn't just about memory; it's about specification. The C standard is a contract between you and the compiler, and the compiler has the better lawyers. Every version of GCC and Clang since around 2010 has gotten more aggressive about exploiting UB for optimization, and there is no indication that trend is reversing. LLVM 19 (shipped late 2024) added new UB-exploiting passes around `freeze` and poison propagation that surprised even longtime contributors.

The community reactions in the HN thread split predictably. The C-defenders argue, correctly, that most UB is avoidable with discipline and modern sanitizers (UBSan, ASan, MSan, TSan) run in CI. They point to OpenSSH, SQLite, and the Linux kernel as proof that disciplined C is shippable. The skeptics argue, also correctly, that "don't write bugs" has never been a working strategy at scale, and that the cost of bugs in C codebases has been borne by users, not vendors. Their exhibit list deserves some sorting, though: Heartbleed was a textbook UB bug (an out-of-bounds read), while Shellshock (a parsing flaw in bash), Dirty COW (a kernel race condition), and the recent xz backdoor (a supply-chain attack) were not UB in the strict sense. Both camps are right; they're just optimizing for different costs.

The more interesting reaction came from the formal-methods corner. Projects like CompCert (a formally verified C compiler) and the K Framework's C semantics have spent years trying to nail down what a "reasonable" subset of C actually means. Their answer, roughly: a much smaller language than what `gcc -O2` accepts. MISRA C, SEI CERT C, and ISO/IEC TS 17961 (C secure coding rules) exist because the actual C standard is, in practice, too permissive to program against safely without an additional rulebook.

What this means for your stack

If you ship C or C++ in production, three things should be non-negotiable in 2026. First, sanitizers in CI on every PR, with `-fsanitize=undefined,address` at minimum, plus the `-fsanitize=integer` group if you build with Clang (GCC does not accept it). The runtime cost is real (2-3x slowdown is typical), but you only pay it in test. Google's OSS-Fuzz has caught thousands of UB bugs this way in projects whose maintainers swore their code was clean.
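As a rough illustration of what sanitizer-backed fuzzing looks like in practice, here is a minimal libFuzzer-style harness. `LLVMFuzzerTestOneInput` is libFuzzer's real entry point; `parse_record` and its length-prefixed format are hypothetical, invented here to give the harness something to exercise. With Clang, a harness like this is typically built with something like `clang -g -O1 -fsanitize=fuzzer,address,undefined harness.c`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser under test: byte 0 is a payload length n, followed
 * by n payload bytes. Writes the byte sum to *sum_out and returns 0, or
 * returns -1 on malformed input. */
int parse_record(const uint8_t *data, size_t size, uint32_t *sum_out)
{
    if (data == NULL || sum_out == NULL || size == 0)
        return -1;

    size_t n = data[0];
    if (n > size - 1)          /* size >= 1 here, so size - 1 cannot wrap */
        return -1;

    uint32_t sum = 0;          /* unsigned arithmetic: wraparound is defined */
    for (size_t i = 0; i < n; i++)
        sum += data[1 + i];

    *sum_out = sum;
    return 0;
}

/* libFuzzer calls this once per generated input. Any out-of-bounds access
 * or other UB inside parse_record is reported by the sanitizers. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    uint32_t sum = 0;
    (void)parse_record(data, size, &sum);
    return 0;                  /* non-zero return values are reserved */
}
```

Delete the `n > size - 1` bounds check and ASan should flag the resulting over-read within seconds of fuzzing, which is the whole argument for running this on every PR.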

Second, hardening flags by default: `-D_FORTIFY_SOURCE=3` (which only takes effect with optimization enabled), `-fstack-protector-strong`, `-fstack-clash-protection`, `-fcf-protection=full` (x86 only), `-Wl,-z,relro,-z,now`. These don't fix UB, but they convert a class of UB-triggered exploits into crashes. The Linux kernel hardening project and the BSDs have been pushing these for years; if your build system doesn't have them, you are downstream of someone else's risk tolerance.

Third, stop writing new C where you have a choice. CISA, the NSA, and the White House ONCD have all published statements in the past two years recommending memory-safe languages for new development; insurance carriers are starting to ask about it during cyber-policy renewals. If you're starting a greenfield systems project today and you choose C over Rust, Zig, or Go for performance reasons, you should have a benchmark, not a vibe. For existing C codebases, the realistic path is incremental: new modules in Rust with C FFI, gradually displacing the perimeter. The Linux kernel, Android, and Windows have all picked this path. The pattern works.

For library authors specifically: assume your callers are hostile to your invariants. Use `_Generic`, `static_assert`, and `[[nodiscard]]` aggressively (the attribute is standard only from C23; on older compilers, fall back to an extension such as GCC's `warn_unused_result`). Document UB preconditions in the API, not just the man page. If your function dereferences a pointer, say so. If it requires aligned input, say so. The standard won't help you; your header file is the only contract that's actually read.
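A small sketch of what that can look like in a header follows. Everything prefixed `ex_` or `EX_` is hypothetical and exists only for this illustration; the point is the combination of a documented precondition, a compile-time assumption check, a must-use result, and a `_Generic` wrapper that rejects the wrong pointer type at compile time.

```c
#include <assert.h>   /* static_assert (a macro in C11/C17, a keyword in C23) */
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Must-use marker: the standard attribute in C23, a GNU extension otherwise. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#  define EX_NODISCARD [[nodiscard]]
#elif defined(__GNUC__)
#  define EX_NODISCARD __attribute__((warn_unused_result))
#else
#  define EX_NODISCARD
#endif

/* State platform assumptions where the compiler can enforce them. */
static_assert(CHAR_BIT == 8, "ex_sum assumes 8-bit bytes");

/*
 * Sums len bytes starting at buf into *out (wrapping modulo 2^32).
 *
 * Preconditions:
 *   - out must not be NULL.
 *   - buf must point to at least len readable bytes;
 *     buf may be NULL only when len == 0.
 * Returns 0 on success, -1 if a checkable precondition is violated.
 */
EX_NODISCARD int ex_sum_u8(const uint8_t *buf, size_t len, uint32_t *out);

/* Type-checked front door: any pointer type other than (const) uint8_t *
 * fails to compile instead of silently reinterpreting memory. */
#define ex_sum(buf, len, out)                 \
    _Generic((buf),                           \
        uint8_t *:       ex_sum_u8,           \
        const uint8_t *: ex_sum_u8)((buf), (len), (out))

int ex_sum_u8(const uint8_t *buf, size_t len, uint32_t *out)
{
    if (out == NULL || (buf == NULL && len != 0))
        return -1;

    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];

    *out = sum;
    return 0;
}
```

Note what the function cannot check: whether `buf` really has `len` readable bytes. That precondition can only live in the comment, which is why the comment belongs in the header.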

Looking ahead

C isn't going anywhere: there's too much of it, too much tooling around it, and too many ABIs frozen to its semantics. But the era of treating C as a "portable assembly language" is over and has been since the optimizer caught up. The honest framing for 2026 and beyond: C is a high-level language with permissively specified semantics that the compiler exploits for performance. Treat it like that, instrument it like that, and budget for the bugs like that. Habets' post is uncomfortable not because it's wrong, but because anyone who's spent a weekend debugging a `-O2`-only crash already knows it's right.
