New language model Openai That for Reasoning And Solve complicated problems Designed, succeeded in gaining performance at the level Gold Medal of the World Mathematics Olympiad (IMO) has been 2025.
According to OpenAI researcher Alexander Wei, the model has been tested in similar human participants, two 4.5 -hour sessions without access to tools, internet or foreign resources. The Openai model has achieved this great achievement by reading the official explanation of issues and providing proofs in natural language.
He explains that this success is important in several respects. First, IMO issues require creative thinking and constant reasoning over a long period of time. According to him, the progress of linguistic models in math understanding of simple problems such as the GSM8K has begun with a solving time under one minute, and now IMO has reached 100 minutes.
Second, IMO descriptions are multi -page and more difficult to evaluate, and therefore, crossing the traditional frameworks of reinforcement learning to achieve accurate human reasoning is a significant achievement. An example of the questions in this test is below.
Openai model performance at the Math Olympiad
According to Openai, the new language model has been able to solve 5 out of 6 2025 Olympiad issues and a total of 35 out of 42 points. He claims that this score is equivalent to the gold medal. Also, each model response is reviewed by the three former IMO medalists independently and the final score is determined by their complete consensus.
The new OpenAI model is currently a test sample and is not expected to be released with this level of ability to solve mathematical problems over the next few months. However, he has emphasized that this success shows the high speed of artificial intelligence in recent years.
Artificial intelligence has made rapid progress in areas such as programming and mathematics. Just a few days ago, one of the Openai models won the second place in the Atcoder Programming Competition and ranked above all human beings (except one person). Also in recent weeks, the Grok-4 Heavy model has been able to get a full 100 score in the Aime 25 math test. Now, with the success of Openai in IMO, it seems that there is a lot of time left to surpass human intelligence in areas such as mathematics and programming.
RCO NEWS




