On a weekend in mid-May, a clandestine mathematical conclave convened. Thirty of the world's most renowned mathematicians traveled to Berkeley, California, some coming from as far away as the United Kingdom. The group's members faced off against a “reasoning” chatbot tasked with solving problems they had devised to test its mathematical mettle. After throwing professor-level questions at the bot for two days, the researchers were amazed to discover that it could answer some of the world's most difficult solvable problems. “I have colleagues who literally said that these models were approaching mathematical genius,” says Ken Ono, a mathematician at the University of Virginia and a leader and judge at the meeting.
The chatbot in question is powered by o4-mini, a so-called reasoning large language model (LLM). It was trained by OpenAI to be able to make very complex deductions. Google's equivalent, Gemini 2.5 Flash, has similar capabilities. Like the LLMs that powered earlier versions of ChatGPT, o4-mini learns to predict the next word in a sequence. Compared with those earlier LLMs, however, o4-mini and its equivalents are lighter, more agile models that train on specialized data sets with stronger reinforcement from humans. The approach leads to a chatbot capable of diving much deeper into complex mathematical problems than traditional LLMs.
To track the progress of o4-mini, OpenAI had previously tasked Epoch AI, a nonprofit organization that benchmarks LLMs, with coming up with 300 mathematical questions whose solutions had not yet been published. Even traditional LLMs can correctly answer many complicated mathematical questions. However, when Epoch AI put these questions, which were unlike those the models had been trained on, to several such models, the most successful were able to solve less than 2% of them, showing that these LLMs lacked the capacity to reason. But o4-mini would prove to be very different.
In September 2024, Epoch AI hired Elliot Glazer, who had recently completed his Ph.D. in mathematics, to join the new collaboration on the benchmark, dubbed FrontierMath. The project collected new questions at different levels of difficulty, with the first three levels covering undergraduate-, graduate- and research-level challenges. By April 2025, Glazer found that o4-mini could solve around 20% of the questions. He then moved on to a fourth level: a set of questions that would be difficult even for an academic mathematician. Only a small group of people in the world would be able to develop such questions, let alone answer them. The mathematicians who participated had to sign a nondisclosure agreement requiring them to communicate only via the messaging app Signal. Other forms of contact, such as traditional email, could potentially be scanned by an LLM and inadvertently train it, thus contaminating the data set.
Each problem that o4-mini could not solve would earn the mathematician who devised it a reward of $7,500. The group made slow, steady progress in finding questions. But Glazer wanted to speed things up, so Epoch AI hosted the in-person meeting on Saturday, May 17, and Sunday, May 18. There, the participants would finalize the last batch of challenge questions. The 30 participants were divided into groups of six. For two days, the academics competed among themselves to devise problems that they could solve but that would trip up the AI reasoning bot.
By the end of that Saturday evening, Ono was frustrated by the bot, whose unexpected mathematical prowess was thwarting the group's progress. “I came up with a problem that experts in my field would recognize as an open question in number theory, a good Ph.D.-level problem,” he says. He asked o4-mini to solve the question. Over the next 10 minutes, Ono watched in stunned silence as the bot unfurled a solution in real time, showing its reasoning process along the way. The bot spent the first two minutes finding and mastering the related literature in the field. Then it wrote on the screen that it wanted to try solving a simpler “toy” version of the question first in order to learn. A few minutes later, it wrote that it was finally ready to solve the more difficult problem. Five minutes after that, o4-mini presented a correct but sassy solution. “It was starting to get really cheeky,” says Ono, who is also an independent mathematical consultant for Epoch AI. “And at the end, it says, ‘No citation necessary because the mystery number was computed by me!’”
Defeated, Ono jumped onto Signal early that Sunday morning and alerted the rest of the participants. “I was not prepared to be contending with an LLM like this,” he says. “I have never seen this kind of reasoning before in models. This is what a scientist does. It is frightening.”
Although the group did eventually manage to find 10 questions that stumped the bot, the researchers were surprised by how far AI had progressed in the space of a year. Ono likened it to working with a “strong collaborator.” Yang-Hui He, a mathematician at the London Institute for Mathematical Sciences and an early pioneer of the use of AI in mathematics, says: “This is what a very, very good graduate student would be doing; in fact, more.”
The bot was also much faster than a professional mathematician, taking only a few minutes to do what would take such a human expert weeks or months.
While sparring with o4-mini was exciting, its progress was also alarming. Ono and He express concern that o4-mini's results might be trusted too much. “There is proof by induction, proof by contradiction, and then proof by intimidation,” He says. “If you say something with enough authority, people just get scared. I think o4-mini has mastered proof by intimidation; it says everything with so much confidence.”
By the end of the meeting, the group had begun to consider what the future might look like for mathematicians. The discussions turned to the inevitable “level five”: questions that even the best mathematicians could not solve. If AI reaches this level, the role of mathematicians would undergo a sharp change. For example, mathematicians may shift to simply posing questions and interacting with reasoning bots to help them discover new mathematical truths, much as a professor does with graduate students. As such, Ono predicts that nurturing creativity in higher education will be key to keeping mathematics going for future generations.
“I have been telling my colleagues that it is a serious mistake to say that generalized artificial intelligence will never come, [that] it's just a computer,” Ono says. “I do not want to add to the hysteria, but in some respects, these large language models are already outperforming most of our best students in the world.”
This article was first published at Scientific American. © ScientificAmerican.com. All rights reserved.