Last night, OpenAI CEO Sam Altman unveiled the company’s latest “reasoning” models, o3 and o3-mini, which build on the o1 models released earlier this year. The company has not yet released the models, but it will make them available today for public safety testing and research access.
These models use what OpenAI calls a “private chain of thought,” in which the model pauses to examine its internal dialogue and plan ahead before responding. This can be called “simulated reasoning” (SR), a type of artificial intelligence that goes beyond basic large language models (LLMs). According to The Information, the company named the model family “o3” instead of “o2” to prevent possible trademark clashes with British telecommunications provider O2. During a live broadcast on Friday, Altman acknowledged his company’s naming missteps, saying, “In the grand tradition of OpenAI being really, truly bad at names, it’ll be called o3.”
According to OpenAI, the o3 model achieved a record-breaking score on ARC-AGI, a visual reasoning benchmark that had gone unbeaten since its creation in 2019. In low-compute scenarios, o3 scored 75.7 percent, and in high-compute testing it reached 87.5 percent, which is comparable to human performance at an 85 percent threshold.
The model also achieved a score of 87.7 percent on GPQA Diamond, which includes graduate-level biology, physics, and chemistry questions. On the FrontierMath benchmark by Epoch AI, o3 solved 25.2 percent of the problems, while no other model has exceeded 2 percent. “When I see these results, I have to change my view of what artificial intelligence can do and what it is capable of,” said the head of the ARC Prize Foundation.

The o3-mini variant, also introduced on Friday, includes an adaptive thinking time feature that offers low, medium, and high processing speeds. The company states that higher compute settings produce better results. OpenAI reports that o3-mini outperforms its predecessor, o1, on the Codeforces benchmark.
Simulated reasoning on the rise
OpenAI’s announcement comes as other companies are developing their own SR models, including Google, which announced Gemini 2.0 Flash Thinking Experimental on Thursday. In November, DeepSeek launched the DeepSeek-R1 model, while Alibaba’s Qwen team released QwQ, billed as the first open alternative to o1.
These new artificial intelligence models are based on traditional LLMs, but with one difference: they are fine-tuned to produce a kind of iterative chain of thought that can consider its own results, simulating reasoning in an almost brute-force way that can be scaled at inference time. This contrasts with focusing on improvements during model training, which has recently seen diminishing returns. OpenAI will first provide the new SR models to safety researchers for testing. Altman said the company plans to launch o3-mini in late January and o3 shortly thereafter.
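The idea of scaling at inference time can be illustrated with a toy example. The sketch below is not OpenAI’s method, which has not been published in detail. It swaps in a much simpler technique, majority voting over several independently sampled answers, to show how spending more compute per question can raise accuracy without retraining. The function names, the 60 percent per-sample accuracy, and the answer values are all hypothetical.

```python
import random
from collections import Counter

CORRECT = 42  # hypothetical ground-truth answer for the toy question


def noisy_solver(rng, p_correct=0.6):
    """Stand-in for one sampled reasoning attempt (an assumption, not a real model).

    Returns the correct answer with probability p_correct, otherwise one of
    several wrong answers chosen uniformly.
    """
    if rng.random() < p_correct:
        return CORRECT
    return rng.choice([41, 43, 44])


def answer_with_budget(n_samples, seed=0):
    """Sample n_samples attempts and return the most common answer."""
    rng = random.Random(seed)
    votes = Counter(noisy_solver(rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def accuracy(n_samples, trials=2000):
    """Fraction of trials in which the voted answer is correct."""
    hits = sum(answer_with_budget(n_samples, seed=t) == CORRECT for t in range(trials))
    return hits / trials


if __name__ == "__main__":
    # A larger sample budget plays the role of a higher compute setting.
    for budget in (1, 5, 15):
        print(budget, accuracy(budget))
```

Running the script prints the measured accuracy for each budget. In this toy setup, the larger budgets should answer correctly more often than a single sample, which mirrors the reported pattern of higher compute settings producing better results.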


Source: The Verge



