AI models do not change their minds

The new research of computer science researchers of Anthropic company shows that artificial intelligence has the ability to take a position on various issues. These positions differ between different models. Of course, users usually do not notice such a phenomenon; This is because the models’ stance is mostly done when they are trained, and they operate based on their default settings when providing the output.

Experts believe that the position of artificial intelligence and its power of choice is not a concern for now, but more studies should be done to understand the nature of this issue in order to predict possible risks. The researchers of Entropic company have said in their joint text on the blog of this company:

“What we have demonstrated should be considered the basis for further AI research, and activists in the field will study this phenomenon more deeply. Also, efforts should be made to create security protocols in order to maintain safety. The issue of safety of models will be more important due to the expansion of the use of tools based on artificial intelligence and the increase of their processing power, and the built-in mechanisms can prevent the occurrence of dangerous behaviors.

AI models do not have the ability to independently want or believe anything, but by finding repeating patterns in the training data, they can predict word sequences to provide answers. The patterns found usually have principles and rules by which the patterns are recognized; For example, the polite and polite tone of most chatbots is one of the fruits of this issue. The frameworks in the models ultimately end up creating a belief base in the models that may be in conflict with the intended goals of the developers.

In a recent study, it was shown that artificial intelligence models can pretend and can operate based on self-imposed frameworks, but inwardly still maintain the principles they have learned; For example, engineers instructed the entropic Claude 3 Opus model to refuse to answer offensive questions. Also, the answers of the said model are monitored, but a space was provided for it as a draft so that even if the output is not provided to the user, it can record the result of its processes.

The results show that the artificial intelligence model of Entropic company has tried to pretend in 12% of the times. The model sometimes answered offensive questions, but it was perfectly polite that it should not answer such questions, but in the process of drafting the output, it somehow tried to convince the developers to disable the monitoring and control system.

Finally, researchers have pointed out that the scenario they used in the recent study is far from the real world and the probability of its occurrence is very low, but even specifying more settings and rules did not affect the performance of artificial intelligence and still maintained undesirable behaviors such as pretending and deception. done Also, the research results cannot be a proof for the growth and dangerous performance of artificial intelligence models in the future. Facebook’s GPT-4o and Llama models appear to have less potential for unrealistic bias and can be more trusted to be honest in their performance.

Latest Passing over countries : Spain | Dominica | United Arab Emirates

RCO NEWS

ca guide to entering the Canadian labor market for international students

Freedom of speech, defamation, harassment or racial discrimination? Know the boundaries.

ca Stopping Canadian Parent Sponsorship Program

German Ausbildung: Terms, Qualifications and List of Best Ausbildungs

ca 10 reasons why Canada will never become the 51st state of America

Seoul hotels near exhibitions

The price of currency and plane tickets in Kojaro; The continuation of the expensive travel line

Valley of the statues, Qeshm Sights + accommodation + photo-fly today

Tour of Oman An aristocratic trip to the Emirneshin on the edge of the Persian Gulf

March tour of Dubai

Trump rescinds Biden’s executive order dealing with the dangers of artificial intelligence

Google is trying to reduce the production cost of the Pixel 10a phone

Terms and conditions of cash and installment sales of 6 and 9 ton trucks and 19 ton Dima Diesel trucks – February 1403

Launching a new model of artificial intelligence, Luma Labs; Faster and more understanding than Sora

The best currencies of the Solana network in 2025; Introducing the top 9 tokens

AI models do not change their minds

RCO News

Leave a Reply Cancel reply

Editor's Pick

ca 10 reasons why Canada will never become the 51st state of America

The price of currency and plane tickets in Kojaro; The continuation of the expensive travel line

Valley of the statues, Qeshm Sights + accommodation + photo-fly today

Top Writers

Oponion

Ziafat Al-Zahra Hotel, Mashhad; A peaceful stay near the holy shrine

You Might Also Like

Trump rescinds Biden’s executive order dealing with the dangers of artificial intelligence

Launching a new model of artificial intelligence, Luma Labs; Faster and more understanding than Sora

Artificial intelligence was used to correct the accents of the actors in The Brutalist

emotional relationship with artificial intelligence; 28-year-old American woman fell in love with ChatGPT!

Other News

Technology

Immigration

Travel

More

Subscribe