Did Grok-3 Really Prove the Riemann Hypothesis?
A weekend tweet by xAI researcher Hieu Pham caused a stir in the AI community:
“Grok-3 AI system just proved the Riemann hypothesis. We’ve paused training to verify the proof. If confirmed, we won’t resume training as such an AI would be considered too intelligent and potentially dangerous to humans.”
Let’s start with the conclusion: it was just a meme.
Even so, the tweet quickly drew more than 2 million views and spread through AI circles worldwide.
The origin traces back to Andrew Curran’s earlier “leak” claiming a catastrophic event during Grok-3’s training.
Bizarre rumors followed. Some joked that OpenAI CEO Sam Altman aimed a massive laser at xAI’s largest training cluster, causing severe data damage. Others suggested deliberate sabotage of next-gen LLM training.
Even Runway founder Cristóbal Valenzuela joined in:
“Gen-4 just won all Oscar categories including Best Picture. We’ve paused training to study its artistic innovations. If the film proves revolutionary as early critics suggest, we won’t resume training as this indicates AI has reached such heights in art that it might threaten human creativity.”
Several xAI researchers amplified the joke by retweeting Curran’s post.
xAI co-founder Greg Yang quipped that Grok-3 physically attacked an elderly security guard during training.
Researcher Heinrich Kuttler added: “Yes, it was terrible! We had to replace all the anomalous weights with NaN (Not a Number) to recover.”
More skeptical users tested the current Grok’s understanding of the Riemann hypothesis, which proved predictably limited.
Finally, Hieu Pham ended the drama:
“Okay, SNL is over. For why proving the Riemann hypothesis is dangerous, I highly recommend @matthaig1’s brilliant novel ‘The Humans.’”
Why did this joke gain such traction? The first reason is the importance of the Riemann hypothesis itself.
The Riemann hypothesis, proposed by German mathematician Bernhard Riemann in 1859, is one of the Clay Mathematics Institute’s Millennium Prize Problems.
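It involves the Riemann zeta function, defined for Re(s) > 1 by the series ζ(s) = 1 + 1/2^s + 1/3^s + 1/4^s + ⋯ and extended to the rest of the complex plane by analytic continuation.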
The hypothesis states that all non-trivial zeros of the Riemann zeta function have a real part equal to 1/2. The Clay Mathematics Institute offers a $1 million prize for its proof.
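To make the statement concrete, here is a minimal Python sketch, not taken from the tweet or from any xAI material. The function names `partial_sum`, `borwein_weights`, and `zeta` are hypothetical, chosen for illustration. The first function sums the series directly, which only works for Re(s) > 1. The second approach assumes Borwein's alternating-series method (with an assumed default of n = 30 terms) to evaluate ζ(s) on the critical line Re(s) = 1/2, where the zeros in question lie. It checks a single known zero numerically and says nothing about the hypothesis as a whole.

```python
from fractions import Fraction
from math import factorial, pi


def partial_sum(s, terms):
    """Direct partial sum 1 + 1/2^s + ... + 1/terms^s (converges only for Re(s) > 1)."""
    return sum(n ** -s for n in range(1, terms + 1))


def borwein_weights(n):
    """Weights (d_k - d_n) / d_n for k = 0..n-1 from Borwein's method.

    d_k = n * sum_{i=0..k} (n+i-1)! * 4^i / ((n-i)! * (2i)!)
    The d_k are built with exact fractions and only the final ratios
    are rounded to floats.
    """
    d = []
    running = Fraction(0)
    for i in range(n + 1):
        running += Fraction(factorial(n + i - 1) * 4 ** i,
                            factorial(n - i) * factorial(2 * i))
        d.append(n * running)
    return [float((d[k] - d[n]) / d[n]) for k in range(n)]


def zeta(s, n=30):
    """Approximate zeta(s) for Re(s) >= 1/2, s != 1, with moderate |Im(s)|."""
    total = 0
    for k, w in enumerate(borwein_weights(n)):
        total += (-1) ** k * w / (k + 1) ** s
    return -total / (1 - 2 ** (1 - s))


if __name__ == "__main__":
    # For s = 2 the series converges to pi^2 / 6.
    print(partial_sum(2, 10_000), zeta(2), pi ** 2 / 6)
    # Imaginary part of the first non-trivial zero, to double precision.
    first_zero = 0.5 + 14.134725141734693j
    print(abs(zeta(first_zero)))
```

Running the script should show the direct sum and the accelerated value both close to π²/6, and a value very close to zero at the first non-trivial zero. Billions of zeros have been checked numerically in this spirit, but numerical checks of this kind are not a proof, which is why the problem remains open.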
The hypothesis has major implications for number theory, especially the distribution of prime numbers. Many modern encryption technologies rely on properties of primes, so a proof would deepen our understanding of these technologies and could influence future security algorithms.
If Grok-3 could prove the Riemann hypothesis, it would advance theoretical mathematics, physics, and cryptography, and it would mark tremendous progress in AI’s reasoning and complex problem-solving.
Moonshot AI founder Yang Zhilin suggests mathematics is ideal for developing AI’s thinking abilities.
Mathematics’ rigorous logical system aligns with AI’s reasoning capabilities. AI’s mathematical problem-solving involves continuous thinking, trying different approaches, and learning from errors through verification and correction.
This idea is reflected in the reinforcement learning used to train OpenAI’s o1. Where previous large models learned from data, o1 learns the thinking process itself, similar to showing your work on a math problem.
On the AIME, a competition for talented high school students, GPT-4o solved 13% of problems, while o1 achieved 83% accuracy.
On GPQA Diamond, a benchmark of PhD-level science questions, GPT-4o scored 56.1%, while o1 reached 78%, surpassing the 69.7% scored by human PhD-level experts.
In International Olympiad in Informatics (IOI) testing, the model ranked in the 49th percentile (213 points) with 50 submissions per problem, and its score rose to 362 points with 10,000 submissions per problem.
Similar to AlphaGo, which learned through reinforcement learning, o1 generates and optimizes chains of thought through reward mechanisms.
Elon Musk has repeatedly hyped Grok-3’s capabilities, claiming it will be “the world’s most powerful AI” when it is released at the end of the year.
Grok-3, developed by xAI, is expected to surpass existing AI models, backed by the world’s largest AI training cluster – Colossus.
Colossus comprises 100,000 liquid-cooled NVIDIA H100 GPUs connected over a single RDMA network fabric, a scale that surpasses any existing supercomputer.
The Information reported that Colossus even caught the attention of Altman, who flew over its training facility to assess development progress and energy supply.
The combination of “strongest AI,” “millennium math problem,” and persistent “AI threat theory” created a perfect rumor storm.
The Grok-3 Riemann hypothesis rumor reflects the AI industry’s state:
- It reveals deep-seated attitudes toward AI – belief in AI’s ultimate capabilities, alongside fear of both overly rapid advancement and insufficient progress.
- Since GPT-4’s release, despite new products, breakthrough innovations have been rare.
Each AI rumor reveals industry anxieties and expectations.
Recent discussions about Scaling Law limitations and “innovation fatigue” have made people impatient with incremental improvements compared to last year’s boom.
The Grok-3 rumor represents collective imagination about the future. Users increasingly await the next qualitative leap like GPT-3.5 to GPT-4.
Real AI breakthroughs often occur when least expected, but we hope to see progress by year-end.