Last night, OpenAI CEO Sam Altman unveiled the company’s latest “reasoning” models, o3 and o3-mini, which build on the o1 models released earlier this year. The company has not yet released the models, but it will make them available today for public safety testing and research access.
The models use what OpenAI calls a “private chain of thought,” in which the model pauses to examine its internal dialogue and plan ahead before responding. This can be called “simulated reasoning” (SR), a type of artificial intelligence that goes beyond basic large language models (LLMs).
According to The Information, the company named the model family “o3” instead of “o2” to avoid potential trademark conflicts with the British telecommunications provider O2. During a live broadcast on Friday, Altman acknowledged his company’s naming missteps, saying, “In the grand tradition of OpenAI being really, truly bad at names, it’ll be called o3.”
According to OpenAI, the o3 model earned a record-breaking score on the ARC-AGI benchmark, a visual reasoning benchmark that had gone unbeaten since its creation in 2019. In low-compute scenarios, o3 scored 75.7 percent, and in high-compute testing it reached 87.5 percent, which is comparable to human performance at an 85 percent threshold.
The model also scored 87.7 percent on GPQA Diamond, which contains graduate-level biology, physics, and chemistry questions. On the Frontier Math benchmark by Epoch AI, o3 solved 25.2 percent of the problems, while no other model has exceeded 2 percent. “When I see these results, I need to switch my worldview about what AI can do and what it is capable of,” said the president of the ARC Prize Foundation.
The o3-mini variant, also introduced on Friday, includes an adaptive thinking time feature that offers low, medium, and high processing speeds. The company states that higher compute settings produce better results. OpenAI reports that o3-mini outperforms its predecessor, o1, on the Codeforces benchmark.
Simulated reasoning on the rise
OpenAI’s announcement comes as other companies develop their own SR models, including Google, which announced Gemini 2.0 Flash Thinking Experimental on Thursday. In November, DeepSeek launched its DeepSeek-R1 model, while Alibaba’s Qwen team released QwQ, which it described as the first “open” alternative to o1.
These new artificial intelligence models are based on conventional LLMs, but with one difference: they are fine-tuned to produce a kind of iterative chain of thought that can consider its own results, simulating reasoning in an almost brute-force way. This approach can be scaled at inference time, instead of relying on improvements during model training, which has recently seen diminishing returns.
OpenAI will first provide the new SR models to safety researchers for testing. Altman said the company plans to launch o3-mini in late January and o3 shortly thereafter.
Source: The Verge
RCO NEWS