In a new study, Stanford University researchers have found that the June version of the popular artificial intelligence chatbot ChatGPT performed worse on several tasks than the March version.
In their study, the scientists compared the performance of OpenAI's chatbot over several months on four "diverse" tasks: solving mathematical problems, answering sensitive questions, generating software code, and visual reasoning. The study also examined two versions of OpenAI's artificial intelligence technology, GPT-3.5 and GPT-4, across different periods of time.
Accuracy differences between versions of ChatGPT
Probably the most remarkable result concerns the GPT-4 model's ability to solve mathematical problems: in March, it correctly identified 17077 as a prime number in 97.6 percent of the questions. But only three months later, its accuracy had dropped to 2.4 percent!
In contrast, GPT-3.5 practically went the opposite way. Its March version answered these questions correctly only 7.4 percent of the time, but by June it had raised the accuracy of its answers to 86.8 percent.
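As a sanity check on the number cited above, a few lines of ordinary code confirm that 17077 is indeed prime. This is a simple trial-division sketch for illustration only, not the evaluation code used in the study:

```python
def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2  # only odd candidates need checking
    return True

print(is_prime(17077))  # 17077 has no divisor up to ~130, so this prints True
```

Trial division is slow for very large numbers, but for a five-digit value like 17077 it settles the question instantly.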
The researchers saw similar results when they asked the models to write code or perform a visual reasoning test (predicting the next shape in a pattern).
The very different results observed from March to June in OpenAI's artificial intelligence models show the unpredictable effects of changes to one part of a model. Stanford computer science professor James Zou, one of the authors of the study, explains:
"When we set out to improve the performance of a large language model on some specific tasks, there can be many unintended consequences that may actually undermine its performance on other tasks. There are different kinds of interdependencies in how the model answers questions that can lead to worse behavior than we've seen so far."