OpenAI (the creator of ChatGPT) recently unveiled the new generation of its artificial intelligence models. These models, known as o3 and o4-mini, represent significant progress over previous versions, according to their creators. However, new reports have raised concerns about their accuracy. The phenomenon of "hallucination", that is, presenting inaccurate information as fact, remains a serious issue in these new models and may have become even more pronounced.
According to a report by TechCrunch, the o3 and o4-mini models appear more prone to producing fabricated content than expected, and OpenAI's own internal tests confirm this. The results of these tests show that the rate of hallucination in o3 and o4-mini is not only higher than in older reasoning models such as o1, o1-mini, and o3-mini, but also exceeds that of standard, widely used OpenAI models such as GPT-4o. These findings are somewhat surprising, since such errors are usually expected to decrease as AI models advance.

Hallucination is one of the main obstacles to the development of this technology, and overcoming it is no easy task; it requires complex approaches. Although newer generations of models often manage to reduce it and show greater precision than their predecessors, this trend appears to have reversed for o3 and o4-mini. That raises important questions about how these models were developed and the challenges ahead.
What doubles the concern is that OpenAI itself has no clear explanation for the increased hallucination in its new models. In its technical report on o3 and o4-mini, the company explicitly states that further research is needed to understand why hallucination increases as reasoning capability improves. This uncertainty shows that fully understanding the internal mechanisms of these complex models remains a major challenge for researchers in the field.
Of course, the advances of these models should not be ignored. Reports suggest that o3 and o4-mini perform better than their predecessors in some areas, especially programming and mathematical problem-solving. However, this performance improvement appears to come at a cost: according to OpenAI's analysis, these models generally "make more claims overall". The increase covers both more accurate statements and, unfortunately, more inaccurate ones.

To convey the scale of the problem, OpenAI cites the results of its internal benchmark PersonQA, which is designed to measure a model's accuracy when providing information about people. The results show that o3 hallucinated in 33% of its answers to the benchmark questions, providing inaccurate information at roughly twice the rate of the previous reasoning models o1 (16%) and o3-mini (14.8%). The situation for o4-mini looks even more worrying: it hallucinated in 48% of cases on PersonQA.
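As a rough illustration (not OpenAI's actual methodology or code), a benchmark like PersonQA ultimately reduces to grading each model answer against reference facts and reporting the fraction of answers that contain a hallucinated claim. A minimal sketch, with purely hypothetical grading data:

```python
# Hypothetical sketch of how a hallucination-rate metric is computed.
# The grades below are made-up examples, not real benchmark results.

def hallucination_rate(graded_answers):
    """Fraction of answers graded as hallucinated (True = contains a false claim)."""
    if not graded_answers:
        return 0.0
    hallucinated = sum(1 for is_hallucination in graded_answers if is_hallucination)
    return hallucinated / len(graded_answers)

# Illustrative grades for six model answers.
grades = [True, False, True, False, False, False]
print(f"hallucination rate: {hallucination_rate(grades):.1%}")  # → hallucination rate: 33.3%
```

A reported figure like "33% on PersonQA" is this kind of ratio computed over the full set of benchmark questions.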
It can be argued that hallucinations sometimes help AI models arrive at new and creative ideas, but this trait is a serious problem for commercial applications and situations where accuracy is a top priority. Businesses and users who need reliable, accurate AI output cannot simply overlook such errors. One promising way to reduce hallucinations and increase accuracy is to equip models with web search capability, which lets the model verify its information against external sources. For example, GPT-4o with web search achieves a notable 90% accuracy on the SimpleQA benchmark (another accuracy measure). This suggests that access to up-to-date external information can play an important role in reducing hallucinations. For now, though, the core challenge with the new o3 and o4-mini models remains in place and will require further investigation by OpenAI.
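The grounding idea described above can be sketched abstractly: before an answer is emitted, its claim is checked against an external source, and anything that cannot be verified is flagged rather than stated as fact. The lookup step here is a stand-in stub, not a real web-search API:

```python
# Illustrative sketch of grounding an answer with an external lookup.
# `lookup` is a stub standing in for a real web-search or retrieval backend.

def lookup(claim, knowledge_base):
    """Stub retrieval step: True if the claim is supported by the source."""
    return claim in knowledge_base

def grounded_answer(claim, knowledge_base):
    """Return the claim only if it can be verified; otherwise flag it."""
    if lookup(claim, knowledge_base):
        return claim
    return f"[unverified] {claim}"

kb = {"Paris is the capital of France"}
print(grounded_answer("Paris is the capital of France", kb))
print(grounded_answer("Lyon is the capital of France", kb))
```

Production systems do something far more elaborate (query generation, source ranking, citation), but the principle is the same: external evidence gates what the model asserts.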


Source: TechCrunch




