The Curl Maintainer Who Called AI Bug Reports Garbage Just Got Proven Wrong

├── "AI-driven vulnerability discovery has crossed a credibility threshold with real-world results on hardened targets"
│  ├── Daniel Stenberg (daniel.haxx.se)

Stenberg, who has been one of the most vocal critics of AI-generated security reports after receiving floods of fabricated CVEs, publicly acknowledged that Mythos found a genuine curl vulnerability. His use of definitive language ('finds' rather than 'claims') signals a meaningful shift from a maintainer whose skepticism was well-documented and hard-earned through years of triaging hallucinated bug reports.

│  └── @TangerineDream (Hacker News, 288 pts)

Submitted the story to Hacker News where it quickly reached 288 points and 110 comments, reflecting the community's recognition that a confirmed AI-found vulnerability in curl — a project continuously fuzzed by OSS-Fuzz since 2017 with 150+ documented CVEs — represents a harder target than previous AI security wins like SQLite.

├── "This result is significant precisely because previous AI vulnerability reports were mostly noise and wasted maintainer time"
│  └── Daniel Stenberg (daniel.haxx.se)

Stenberg's earlier public criticism of AI-generated vulnerability reports — describing fabricated CVEs with non-existent functions and invented memory corruption bugs — establishes the baseline against which this Mythos finding is measured. His call for platforms like HackerOne to address AI security spam underscores that the signal-to-noise ratio of AI vulnerability reports has been abysmal, making this genuine find stand out.

└── "Curl represents a harder and more meaningful test case for AI security tools than prior successes like SQLite"
  └── top10.dev editorial (top10.dev) → read below

The editorial argues that curl is a harder target than SQLite, noting its 28-year history, continuous fuzzing by Google's OSS-Fuzz since 2017, over 150 documented CVEs, and the fact that a single maintainer personally triages every security report. This makes an AI-discovered vulnerability in curl a stronger proof point than Google's Project Big Sleep finding a bug in SQLite in 2024, which skeptics dismissed as a one-off.

What Happened

Daniel Stenberg, the creator and sole lead maintainer of curl, the ubiquitous data transfer tool with an estimated 20 billion installations, published a blog post on May 11, 2026, acknowledging that an AI tool called Mythos found a genuine vulnerability in curl's codebase. The post, which quickly hit the front page of Hacker News with a score north of 280, is remarkable less for the vulnerability itself than for who is saying it.

Stenberg has been arguably the most prominent open-source maintainer criticizing AI-generated vulnerability reports. In early 2024, he published a blistering post about receiving fabricated CVE reports from people using LLMs — reports that described non-existent functions, invented memory corruption bugs, and wasted hours of maintainer time triaging hallucinated nonsense. He publicly called on platforms like HackerOne to address the flood of AI-generated security spam. His skepticism was well-earned: curl has been fuzzed continuously by Google's OSS-Fuzz since 2017, has had over 150 documented CVEs across its 28-year history, and Stenberg personally triages every single security report.

So when he titles a blog post "Mythos Finds a Curl Vulnerability" — not "Mythos Claims" or "Mythos Alleges" — the developer community pays attention.

Why It Matters

The significance here isn't one more CVE in curl's long and well-managed security history. It's the pattern this fits into.

In November 2024, Google's Big Sleep project (the successor to Project Naptime) found a genuine, exploitable stack buffer underflow in SQLite, and the bug was fixed before it appeared in an official release. That was widely regarded as the first public case of an AI agent discovering a previously unknown, exploitable vulnerability in widely used, heavily audited software. At the time, skeptics (including many security researchers) argued it was a one-off: that SQLite, despite its extensive test suite, had specific patterns that happened to be amenable to LLM-based reasoning.

Curl is arguably a harder target than SQLite for AI vulnerability discovery, because it has been subjected to more continuous, diverse fuzzing than almost any other open-source project. OSS-Fuzz alone has generated billions of test inputs for curl, and the project maintains its own extensive fuzzing harnesses. That an AI tool found something new suggests it is reaching bugs that coverage-guided fuzzing does not, likely by combining static-analysis-style reasoning, semantic understanding of protocol implementations, and input generation strategies that pure coverage-guided fuzzers miss.

This also matters because of the signal-to-noise problem. The open-source maintainer community has been drowning in AI-generated vulnerability reports since ChatGPT went mainstream. The vast majority are worthless: hallucinated function names, imagined buffer overflows, copy-pasted templates with project names swapped in. Multiple CVE numbering authorities have had to tighten their processes. HackerOne and Bugcrowd have both implemented AI-detection filters.

The result has been a reflexive skepticism toward any AI-involved security finding. When nearly everything labeled "AI-found vulnerability" is garbage, even the legitimate finds get dismissed. Stenberg publicly acknowledging Mythos's find is about as strong an endorsement as the field offers: validation from someone with every reason to be skeptical.

What This Means for Your Stack

If you maintain open-source software, the practical takeaway is not "AI security tools work now" in some blanket sense. The gap between tools like Mythos (and Google's Big Sleep) and the average LLM-generated vulnerability report is enormous — possibly wider than the gap between a skilled penetration tester and a script kiddie running Metasploit defaults.

The actionable question for security teams is whether these tools are accessible. Big Sleep remains a Google-internal research project. If Mythos is available for external use (or open-sourced), it's worth evaluating against your own codebase — particularly for C/C++ projects with complex state machines, protocol parsers, or memory management patterns that traditional fuzzers handle poorly.

For the broader ecosystem, this is also a signal about where CI/CD security pipelines are heading. Today's best practice is running SAST tools, dependency scanners, and maybe OSS-Fuzz integration. Within 18 months, expect AI-assisted vulnerability discovery to become a standard CI/CD stage — not replacing fuzzers, but running alongside them as a complementary analysis layer. The economics make sense: the cost of an AI analysis pass is dropping fast, and the cost of a missed vulnerability in infrastructure software like curl is measured in billions of affected devices.

For maintainers still buried in AI-generated report spam, the uncomfortable reality is that you can't blanket-reject AI-sourced reports anymore. The wheat exists among the chaff. The better filter isn't "was AI involved?" but "does this report demonstrate understanding of my codebase's actual behavior?" — which, ironically, is the same filter you'd apply to human-submitted reports.
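One cheap first pass at that filter is to check whether the functions a report cites exist in the codebase at all, since hallucinated function names are a recurring tell. The sketch below is a minimal illustration under stated assumptions, not a description of how curl or any platform triages reports: the helper names, the sample report text, and the symbol set are all hypothetical, and a real symbol list would have to come from your own tooling (for example a ctags index).

```python
import re

# Matches an identifier immediately followed by "(", e.g. "parse_header(".
# Ordinary prose usually puts a space before a parenthesis, so this mostly
# picks up function-call-style mentions.
CALL_PATTERN = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\(")


def cited_functions(report_text):
    """Return the set of function-like names mentioned in a report."""
    return set(CALL_PATTERN.findall(report_text))


def unknown_functions(report_text, known_symbols):
    """Return cited names that are absent from the codebase, sorted."""
    return sorted(cited_functions(report_text) - set(known_symbols))


def looks_grounded(report_text, known_symbols):
    """True if the report cites at least one function and every one exists."""
    cited = cited_functions(report_text)
    return bool(cited) and cited <= set(known_symbols)


# Hypothetical symbol set and report text, for illustration only.
known = {"parse_header", "copy_body"}
report = "Heap overflow in parse_header() when called from magic_decode()."
print(unknown_functions(report, known))
```

A check like this only rejects the crudest fabrications. A report that passes it still needs a human to confirm that the described behavior matches what the code actually does.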

Looking Ahead

We're watching the transition from AI-as-noise to AI-as-tool in security, and curl just became the benchmark. If one of the most scrutinized, most fuzzed, most skeptically maintained open-source projects in the world acknowledges that AI found something real, the conversation shifts. The question is no longer whether AI can find real vulnerabilities. It is how quickly the tooling matures past the current state, where the overwhelming majority of AI security output is still garbage and a small fraction is genuinely impressive. For practitioners, the move is to track which tools fall in that small fraction, and ignore the rest with extreme prejudice.

