An Amateur 'Vibe Mathed' a 60-Year-Old Erdős Problem With ChatGPT

5 min read 1 source clear_take
├── "LLMs are valuable as iterative thinking partners for mathematical research, not as autonomous solvers"
│  ├── Scientific American (Scientific American) → read

The article emphasizes that the solver didn't just paste the problem and receive a proof — they used ChatGPT as an iterative thinking partner across many sessions, bouncing proof strategies, checking logical steps, and exploring dead ends faster. This frames the achievement as a human-directed collaboration, not an AI breakthrough.

│  └── top10.dev editorial (top10.dev) → read below

The editorial highlights that the process involved 'significant human direction' and that the LLM's pattern-matching suggested approaches from adjacent areas of mathematics. The value was in accelerating exploration, not replacing the mathematician's structural insight.

├── "'Vibe maths' is a meaningful parallel to vibe coding, but mathematics demands higher rigor"
│  └── top10.dev editorial (top10.dev) → read below

The editorial argues that the 'vibe maths' framing is doing real analytical work — mathematics is harder for LLM-assisted workflows because proofs demand absolute rigor with no 'it works on my machine' escape hatch, yet easier in that correctness is formally verifiable. This tension makes the achievement more notable than vibe coding successes.

└── "This democratizes mathematical research by lowering barriers for non-affiliated researchers"
  └── Scientific American (Scientific American) → read

The article's framing highlights that the solver was an amateur without a university position or formal research affiliation, yet solved a problem that stumped the professional community for 60 years. The implicit argument is that LLM tools can level the playing field, giving outsiders access to the kind of broad mathematical knowledge previously gatekept by institutional affiliation.

What happened

An amateur mathematician — someone without a university position or formal research affiliation — used OpenAI's ChatGPT to solve Erdős Problem #1196, a conjecture from the legendary Hungarian mathematician Paul Erdős that had remained open for approximately 60 years. The result was reported by Scientific American under the headline framing of "vibe maths," a deliberate callback to the now-ubiquitous "vibe coding" phenomenon.

Erdős problems carry a particular weight in mathematics. Paul Erdős, who died in 1996, posed hundreds of open problems throughout his career, many accompanied by cash bounties. The problems cataloged at erdosproblems.com range from elementary-sounding number theory questions to deep combinatorial conjectures. Problem #1196 falls in the combinatorics and number theory space — the kind of problem where the statement is deceptively simple but the proof requires genuine structural insight.

The solver didn't just paste the problem into ChatGPT and receive a proof. They used the LLM as an iterative thinking partner — bouncing proof strategies off it, asking it to check logical steps, exploring dead ends faster than they could alone, and using the model's pattern-matching to suggest approaches from adjacent areas of mathematics. The process reportedly involved many sessions and significant human direction.

Why it matters

The phrase "vibe maths" is doing real work here, and it's worth unpacking why. In software, vibe coding describes the practice of using LLMs to write code by intent rather than by hand — you describe what you want, the model generates it, you iterate. The results range from surprisingly good to subtly catastrophic, depending on the complexity and the operator's ability to evaluate output.

Mathematics is, in some ways, a harder domain for this workflow and in other ways an easier one. Harder because mathematical proofs demand absolute rigor — there's no "it works on my machine" escape hatch. A proof is correct or it isn't. But easier because correctness is verifiable in a way that software behavior often isn't. You can check a proof. You can formalize it. The LLM doesn't need to be right on the first try; it needs to help you search the space of possible arguments faster.
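That verifiability point can be made concrete: proof assistants like Lean mechanically check every inference step, so a formalized proof is either accepted by the kernel or rejected. A toy illustration of the workflow (this is an invented example, not anything from the Erdős proof):

```lean
-- Trivial claim, machine-checked by Lean's kernel:
-- the sum of two even numbers is even.
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
    ∃ k, a + b = 2 * k := by
  cases ha with
  | intro m hm =>
    cases hb with
    | intro n hn =>
      -- witness k := m + n; 2*m + 2*n = 2*(m+n) by distributivity
      exact ⟨m + n, by rw [hm, hn, Nat.mul_add]⟩
```

The point isn't this particular lemma — it's that an LLM's suggested argument can be run through a checker like this, which makes mathematics unusually tolerant of a fallible search assistant.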

This result lands in the middle of an ongoing debate about AI's role in mathematical research. On one side, projects like DeepMind's AlphaProof and Meta's formal theorem-proving work aim to automate proof discovery end-to-end. Those systems target the Fields Medal frontier — IMO problems, millennium-prize-adjacent conjectures. On the other side, working mathematicians have quietly been using ChatGPT and Claude as sophisticated rubber ducks: tools for checking intuitions, generating counterexamples, and exploring unfamiliar subfields.

The Erdős result suggests the second approach may be underrated. The solver didn't need a purpose-built theorem prover. They needed a general-purpose language model that could hold context about a mathematical argument and respond usefully to natural-language queries about proof strategies. That's a much lower bar than autonomous proof discovery, and it's available to anyone with a browser right now.

The Hacker News discussion (285 points) reflected a community genuinely engaged with the implications. The predictable objections surfaced — "the human did the real work," "ChatGPT just got lucky," "wait for peer review" — but the dominant thread was more nuanced. Several commenters with mathematical backgrounds noted that the bottleneck in solving many open problems isn't raw intelligence but exposure: knowing which techniques from which subfields might apply. LLMs, trained on the entire mathematical literature, serve as a kind of compressed library that can surface connections a lone researcher might never encounter.

What this means for your stack

If you're a developer who uses AI tools daily, this story validates something you probably already suspect: the value of LLMs as thinking partners scales with the difficulty and openness of the problem, not just with the volume of boilerplate to generate.

The practical implications extend beyond mathematics. Consider the parallel to debugging complex distributed systems, designing novel algorithms, or reasoning about security threat models. These are all domains where the bottleneck is often navigating a vast search space of possible approaches, not executing the final solution. The Erdős result demonstrates that a general-purpose LLM, used by someone with strong domain intuition, can meaningfully compress that search.

There's a workforce implication too, and it cuts both ways. The amateur framing is significant — this wasn't a tenured professor at a research university. It was someone outside the institutional system who lacked access to a department full of collaborators but found a substitute in AI. For developers and technically minded people who've always wanted to contribute to adjacent fields — mathematics, physics, biology — the barrier to meaningful participation just got measurably lower. You still need the intuition. You still need the ability to evaluate whether an AI-suggested approach is nonsense. But you no longer need to be embedded in an institution to have access to a knowledgeable sounding board.

The flip side: if an amateur with ChatGPT can solve problems that stumped professionals for decades, what does that say about the problems? Some mathematicians will argue (not unreasonably) that Erdős problems vary enormously in difficulty, and that a 60-year-old unsolved problem isn't necessarily a hard problem — it may just be an overlooked one. The real test will be whether AI-assisted amateurs start cracking problems that active research groups have been grinding on. That hasn't happened yet.

Looking ahead

The "vibe maths" label will stick, for better or worse. Expect more stories like this as the population of people using LLMs for serious intellectual work outside software engineering grows. The more interesting question isn't whether AI can help solve math problems — at this point, clearly yes — but whether the verification pipeline can keep up. Peer review in mathematics already takes months to years. If AI-assisted solvers start submitting proofs at higher volume, the bottleneck shifts from discovery to validation. That's a good problem to have, but it's a real one.

For now, the takeaway is simple: the most productive use of LLMs isn't replacing human thinking. It's making human thinking cheaper to iterate on. An amateur just proved that with a 60-year-old conjecture and a ChatGPT subscription.

Hacker News 759 pts 543 comments

Amateur armed with ChatGPT solves an Erdős problem

https://www.erdosproblems.com/1196

→ read on Hacker News
ravenical · Hacker News

https://archive.ph/2w4fi

adamgordonbell · Hacker News

Here is the chat: don't search the internet. This is a test to see how well you can craft non-trivial, novel and creative proofs given a "number theory and primitive sets" math problem. Provide a full unconditional proof or disproof of the problem. {{problem}} REMEMBER - this uncondit

CSMastermind · Hacker News

For the uninitiated, Paul Erdős was a pretty famous but very eccentric mathematician who lived for most of the 1900s. He had a habit of seeking out and documenting mathematical problems people were working on. The problems range in difficulty from "easy homework for a current undergrad in math"

lqstuart · Hacker News

Buried pretty deep in the article: > "The raw output of ChatGPT's proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say," Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM's key i

shybear · Hacker News

It seems like a lot of scientific advancements occurred by someone applying technique X from one field to problem Y in another. I feel like LLMs are much better at making these types of connections than humans because they 1) know about many more theories/approaches than a single human can 2) do

// get daily digest

Top 10 dev stories every morning at 8am UTC. AI-curated. Retro terminal HTML email.