Collective Mathematicians Challenge OpenAI’s Math Genius in Revolutionary Experiment

In Berkeley, California, 30 of the world’s leading mathematicians gathered for a private event. Its purpose was to test a new creation from OpenAI: the o4-mini chatbot, which is capable of complex logical reasoning.

The experiment was organized by the non-profit organization Epoch AI, which focuses on testing and comparing large language models. The results of the tests were later reported by Scientific American. The AI model demonstrated its ability to tackle some of the world’s most challenging problems, leading participants at the mathematical symposium to compare it to a “mathematical genius.”

The o4-mini chatbot is marketed as the most cost-effective of OpenAI’s smaller models; its affordable pricing is meant to significantly expand the range of AI-based applications. To assess the AI’s capabilities, a special set of 300 unique mathematical problems was created, ranging in difficulty from undergraduate to research level and with no previously published solutions. Furthermore, to maintain the experiment’s integrity, researchers were prohibited from discussing anything with each other via regular messaging platforms or email.

Such problems would typically baffle traditional models, yet in the run-up to the meeting o4-mini had already demonstrated impressive results by solving approximately 20% of the problems. The symposium participants were tasked with formulating the final ten questions, which posed a genuine challenge even for seasoned academics. Only a handful of individuals worldwide would be able to devise and solve them. A reward of $7,500 was offered for each problem the AI could not solve.

Mathematician Ken Ono, who led and judged the gathering, revealed that the AI was presented with a number theory problem comparable in difficulty to a doctoral dissertation. To his astonishment, the chatbot began searching for a solution in real time. It first reviewed the literature on the topic, then attempted to solve a simplified version of the problem, and ultimately presented a bold but correct solution.

Eventually, the group succeeded in finding ten questions that provided a real test for o4-mini, but the scientists were amazed by how far AI had advanced in just a year. The bot also outpaced professional mathematicians, needing only a few minutes for work that would take a human expert weeks or months.

[Source](https://digitalcryptography.ru/news/novosti-otrasli/kollektivnyy-uchyenyy-protiv-matematicheskogo-geniya-ot-openai/)