The article frames the achievement as validation of the 'AI-as-collaborator' model, in which ChatGPT served as a high-bandwidth thinking partner rather than an oracle. The significant finding, it argues, is that LLMs can make amateurs productive in domains that previously required years of specialized training even to attempt.
The editorial describes this as the 'bicycle for the mind' thesis made concrete, arguing the real story isn't that LLMs can do math but that they can make amateurs productive in one of the most demanding intellectual disciplines. The human provided taste, direction, and verification while the AI provided exploratory breadth.
The editorial explicitly states 'ChatGPT did not solve this problem. A human solved it, with ChatGPT as an accelerant.' It notes the final proof required the human to exercise mathematical judgment that the model itself could not provide, even as it acknowledges this distinction may undersell the collaboration.
The article coins and explores the term 'vibe maths' — a riff on Karpathy's 'vibe coding' — as a distinct methodology where the mathematician uses ChatGPT not for finished proofs but to suggest directions, check intermediate steps, and iterate on partial arguments far faster than working alone.
An amateur mathematician — someone without a professional academic position in mathematics — has solved Erdős problem #1196, a combinatorics question that has stood open for approximately 60 years. The tool that helped bridge the gap between enthusiast and published result: ChatGPT.
The problem, catalogued on erdosproblems.com, is one of hundreds posed by the legendary Hungarian mathematician Paul Erdős, who spent decades scattering unsolved problems across the mathematical landscape like seeds. Many remain open. Some carry cash bounties. All carry prestige. Solving any Erdős problem is notable; solving one without institutional affiliation, using an LLM as a collaborator, is unprecedented.
The approach has been dubbed "vibe maths" — a riff on Andrej Karpathy's "vibe coding" meme — where the human mathematician used ChatGPT not as an oracle that produces finished proofs, but as a high-bandwidth thinking partner. The LLM suggested directions to explore, checked intermediate steps, and helped the solver iterate on partial arguments far faster than they could alone. The final proof, however, required the human to exercise mathematical judgment that the model itself could not.
Let's get the obvious objection out of the way: ChatGPT did not solve this problem. A human solved it, with ChatGPT as an accelerant. But that distinction, while technically correct, undersells what actually happened here.
The significant finding isn't that LLMs can do mathematics — it's that LLMs can make *amateurs* productive in domains that previously required years of specialized training to even attempt. This is the "bicycle for the mind" thesis made concrete in one of the most demanding intellectual disciplines that exists.
The Hacker News discussion (score: 190 and climbing) has split predictably into camps. One side argues this validates the AI-as-collaborator model: the human provided taste, direction, and verification while the AI provided breadth, speed, and pattern-matching across a vast corpus of mathematical techniques. The other side worries about what "vibe maths" means for rigor — if the solver doesn't fully understand every step the LLM suggested, is the proof actually trustworthy?
The answer, at least in this case, appears to be yes. Professional mathematicians have reviewed the work. The proof stands on its own merits regardless of how the ideas were generated. Mathematics doesn't care about provenance — a correct proof is correct whether it was conceived in a shower, on a napkin, or in a ChatGPT conversation.
But the meta-question is more interesting: what does it mean when the barrier to entry for serious mathematical research drops from "PhD plus years of specialization" to "deep interest plus an LLM subscription"? We've seen analogous shifts in software engineering (GitHub Copilot turning junior devs into mid-level contributors on unfamiliar codebases), in writing (LLMs helping non-native speakers produce polished English), and in legal research (AI tools letting small firms compete with BigLaw on document review). Mathematics was supposed to be different — too abstract, too rigorous, too dependent on deep structural intuition.
Apparently not.
The term "vibe maths" is deliberately provocative, but the underlying workflow is recognizable to anyone who's used LLMs productively for technical work. It follows a pattern:
1. Human frames the problem. The solver identified which Erdős problem to attempt, understood the existing literature, and knew what a solution would need to look like.
2. LLM generates candidate approaches. ChatGPT suggested proof strategies, relevant theorems, and structural ideas — many wrong, some interesting, a few genuinely useful.
3. Human filters and steers. The solver evaluated which directions were promising based on mathematical intuition the model lacks. This is the "vibe" part — pattern-matching on what *feels* right before you can prove it.
4. Iterate rapidly. The feedback loop between human judgment and LLM generation compressed what might have been months of solo exploration into a much shorter timeline.
5. Human writes the final proof. The published result is a standard mathematical proof that stands independent of the process that generated it.
This workflow maps almost exactly to how productive developers use Copilot or Claude for complex engineering tasks: you need to know what good looks like, but the AI helps you get there faster. The people who dismiss this as "the AI did it" misunderstand the process. The people who dismiss it as "the human did everything" ignore the counterfactual — this particular human likely would not have solved this particular problem without the tool.
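For readers who think in code, here is a minimal sketch of that loop. It is an illustration under stated assumptions, not the solver's actual tooling: `llm_suggest` is a hypothetical stand-in for whatever chat-completion client you use, and `human_filter` is where mathematical judgment enters.

```python
# Minimal sketch of the explore-filter-iterate loop described above.
# llm_suggest is a placeholder, not a real API; the point is the division
# of labor: the model supplies breadth, the human supplies judgment.

from dataclasses import dataclass


@dataclass
class Candidate:
    idea: str            # a proof strategy or structural idea proposed by the model
    keep: bool = False   # set by the human after checking the intermediate steps


def llm_suggest(problem: str, context: list[str], n: int = 5) -> list[str]:
    """Placeholder for a chat-completion call returning n candidate directions."""
    raise NotImplementedError("wire this to whatever LLM client you use")


def human_filter(candidates: list[Candidate]) -> list[Candidate]:
    """The 'vibe' step: the human marks which directions feel structurally right."""
    for c in candidates:
        c.keep = input(f"Pursue '{c.idea[:60]}'? [y/N] ").strip().lower() == "y"
    return [c for c in candidates if c.keep]


def explore(problem: str, rounds: int = 10) -> list[str]:
    context: list[str] = []   # partial arguments carried between rounds
    for _ in range(rounds):
        ideas = llm_suggest(problem, context)                   # step 2: LLM breadth
        keepers = human_filter([Candidate(i) for i in ideas])   # step 3: human taste
        if keepers:
            context.extend(c.idea for c in keepers)             # step 4: iterate on what survived
        else:
            context.append("Previous directions were dead ends; propose a different angle.")
    return context   # step 5, writing the actual proof, stays entirely human
```

The sketch makes the claim above concrete: the model only touches steps 2 through 4, and everything load-bearing sits on the human side of the loop.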
If you're building AI-assisted developer tools, this is a case study worth examining closely. The key architectural insight is that the LLM's value wasn't in producing a correct final output; it was in compressing the exploration phase. The solver explored more dead ends faster, which meant finding the right path sooner.
This has direct implications for how we design AI coding tools. The current generation of AI assistants is optimized for code generation — give me a function that does X. But the vibe maths result suggests the higher-value application might be exploration assistance: help me understand which architectural approach is worth pursuing before I commit to building it. Help me enumerate the failure modes I haven't considered. Help me find the relevant prior art I don't know to search for.
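As a rough illustration of that shift, here is a hedged sketch of what an exploration-first request could look like as a tool primitive. The prompt wording and the `complete()` placeholder are assumptions for illustration, not any existing product's API.

```python
# Sketch of an "exploration assistance" primitive: ask for ranked approaches,
# failure modes, and prior art instead of code. The prompt shape and the
# complete() placeholder are assumptions, not any vendor's API.

EXPLORATION_PROMPT = """\
You are helping me decide, not build.
Problem: {problem}
Constraints: {constraints}

1. List three to five candidate approaches, one line each.
2. For each, name the main failure mode I should check before committing.
3. Point me to prior art or standard techniques I may not know to search for.
Do not write implementation code."""


def complete(prompt: str) -> str:
    """Placeholder for whatever chat-completion client the team already uses."""
    raise NotImplementedError


def explore_decision(problem: str, constraints: str) -> str:
    # Returns a survey of the option space; the commit decision stays with the human.
    return complete(EXPLORATION_PROMPT.format(problem=problem, constraints=constraints))
```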
For teams evaluating AI tool ROI: stop measuring lines of code generated and start measuring time-to-good-decision. The amateur mathematician didn't need ChatGPT to write the proof — they needed it to figure out which proof to write.
There's also a talent-market implication. If domain amateurs with AI tools can now compete with specialists on well-defined open problems, the moat around deep specialization narrows. This doesn't mean expertise becomes worthless — the solver still needed substantial mathematical knowledge to frame the problem and evaluate outputs. But it does mean the minimum viable expertise for serious contributions is dropping.
We're early in understanding what "AI-assisted research" actually looks like in practice, as opposed to the breathless predictions and dismissive counter-takes that dominate the discourse. This Erdős result is a single data point, but it's a striking one. The next question isn't whether LLMs can help amateurs solve hard problems — we now have proof they can. The question is whether this scales: can the vibe maths approach work on problems where verification is harder than generation, where you can't easily check if the AI's suggestions led you astray? In mathematics, a proof is a proof. In engineering, a system that works in testing can still fail in production. The gap between "LLM helped me find an answer" and "LLM helped me find the *right* answer" remains the central unsolved problem of AI-assisted work.
https://www.erdosproblems.com/1196
→ read on Hacker News

Here is the chat: don't search the internet. This is a test to see how well you can craft non-trivial, novel and creative proofs given a "number theory and primitive sets" math problem. Provide a full unconditional proof or disproof of the problem. {{problem}} REMEMBER - this uncondit…
For the uninitiated, Paul Erdős was a pretty famous but very eccentric mathematician who lived for most of the 1900s. He had a habit of seeking out and documenting mathematical problems people were working on. The problems range in difficulty from "easy homework for a current undergrad in math"…
Buried pretty deep in the article: “The raw output of ChatGPT’s proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say,” Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM’s key i…
It seems like a lot of scientific advancements occurred by someone applying technique X from one field to problem Y in another. I feel like LLMs are much better at making these types of connections than humans because they 1) know about many more theories/approaches than a single human can 2) do…
https://archive.ph/2w4fi