RCO NEWS: daily world news agency based in Dubai, UAE

Reinforcement Learning: A Path to Artificial General Intelligence?

IT Technology
Last updated: 2026/03/15 at 3:58 PM
Contents
  • When machines mastered chess and Go
  • RLHF and the role of humans in chatbot training
  • A bridge to artificial general intelligence, or a mirage?

Hamid Reza Mazandarani, a network and artificial intelligence researcher, examines the history, the hidden role, and the challenges of reinforcement learning in artificial intelligence in an exclusive note written for Digiato.

Reinforcement learning has been a much-discussed path over the past few decades, and today it looks more eye-catching than ever. But where does this path lead, and what destination can we expect? The following note takes a brief look at these questions.

Reinforcement learning adjusts a model's parameters through interaction with an environment and the rewards it receives in return. In other words, the training data is generated along the way, with no inherent need for labels or ready-made training data. The approach is regarded as a complement to conventional learning, especially for decision-making problems in which the right choice in a given situation is not always clear.
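To make the loop of interaction, reward, and parameter update concrete, here is a minimal sketch of tabular Q-learning, one classic reinforcement learning algorithm. The corridor environment, the function name, and all hyperparameters are illustrative assumptions, not something taken from the note itself.

```python
import random

def train_corridor_agent(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                         epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. The agent starts at state 0 and receives a
    reward of 1.0 only when it reaches the rightmost state.
    Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # the agent's "parameters"
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore occasionally; otherwise act greedily (ties broken at random).
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Interaction: the environment returns the next state and a reward.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Learning: move the estimate toward reward + discounted future value.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q

q_table = train_corridor_agent()
greedy_policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(greedy_policy)
```

No labeled examples appear anywhere in this sketch: the only training signal is the reward the agent collects from its own attempts.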

Two scientists, Richard Sutton and Andrew Barto, established the scientific framework of reinforcement learning as we know it today in the late 1980s. The underlying ideas, however, had been introduced by psychologists in the early twentieth century. You may have heard of the famous "Skinner box", in which animals learned to press a lever to receive food.

The famous "Skinner box" experiment for studying an animal's response to reward (Source: Forbes)

Later, however, psychologists found that reinforcement learning is too simple a model to describe the behavior of humans, or even animals. A well-known example is the phenomenon of "learned helplessness", in which living beings under frustrating conditions make no attempt to maximize reward, contrary to what reinforcement learning would predict.

When machines mastered chess and Go

In the world of artificial intelligence, however, the main obstacle to reinforcement learning was of a different kind: the need for a huge number of interactions with the environment just to behave slightly better than a random agent. In the second half of the last decade, a combination of hardware advances, the emergence of deep learning, and more efficient algorithms partially removed this obstacle. As a result, the conditions were in place for DeepMind's intelligent models to defeat the champions of chess and Go. These models improved by playing millions of games against themselves, a technique called self-play.

At that point, all the evidence suggested that reinforcement learning would be the rising star of artificial intelligence, but the story turned out differently: language models trained to predict text started a revolution that transformed human life. These days, ChatGPT and its competitors have become an integral part of people's lives around the world, and there is even talk of extending their abilities in the form of "intelligent agents".

But what became of reinforcement learning? Interestingly, reinforcement learning has also contributed to the evolution of language models. The problem with the early language models was that they were not ready to converse with humans. Training these models with reinforcement learning, by rewarding their responses, laid the groundwork for models that are better aligned with what users ask for.

RLHF and the role of humans in chatbot training

In 2017, DeepMind, in collaboration with OpenAI, developed a method known as RLHF (reinforcement learning from human feedback). In this algorithm, human users choose the more useful and safer of two answers produced by the language model. These choices are used to train a reward model, which in turn forms the basis for training the main model. In a sense, the reward model acts as a referee or critic for the language model.
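The step from pairwise human choices to a reward model can be sketched as follows. This is a toy illustration only: it assumes each answer has already been reduced to two hypothetical numeric features, and it fits a linear reward with the Bradley-Terry (logistic) loss commonly used for preference data. Real reward models are neural networks that score text directly.

```python
import math

def train_reward_model(preferences, n_features, lr=0.5, epochs=200):
    """Fit a linear reward r(x) = w . x from (chosen, rejected) feature pairs.

    Minimizes the Bradley-Terry loss: -log sigmoid(r(chosen) - r(rejected)).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for chosen, rejected in preferences:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # 1 - sigmoid(margin): how strongly the model still disagrees
            # with the human choice; it scales the gradient step.
            p_wrong = 1.0 / (1.0 + math.exp(margin))
            w = [wi + lr * p_wrong * di for wi, di in zip(w, diff)]
    return w

def reward(w, features):
    """Score one answer with the learned reward model."""
    return sum(wi * xi for wi, xi in zip(w, features))

# Hypothetical features per answer: [helpfulness score, unsafe-content flag].
preferences = [
    ([0.9, 0.0], [0.2, 0.0]),  # the labeler preferred the more helpful answer
    ([0.6, 0.0], [0.8, 1.0]),  # the labeler preferred the safe answer
    ([0.7, 0.0], [0.1, 1.0]),
]
w = train_reward_model(preferences, n_features=2)
print(reward(w, [0.8, 0.0]) > reward(w, [0.8, 1.0]))
```

Once trained, a reward model of this kind can score new answers without a human in the loop, which is what lets it serve as the "referee" during reinforcement learning.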

Although RLHF applies reinforcement learning to the original model, scientists were not satisfied and developed other ideas that require no human user at all. The result was the invention of methods such as RLVR (reinforcement learning with verifiable rewards), which reward the language model based on whether its answer is correct. The correct answer can be the output of a piece of program code or the final answer to a math problem. From now on, whenever a model helps you with coding, remember that it was trained not only to predict text but also by trying to find correct coding answers.
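A verifiable reward can be as simple as a function that checks the model's output against a known result. The sketch below shows two hypothetical checkers, one for math answers and one for code. The "Answer:" marker convention and both function names are assumptions made for illustration, and a production system would run untrusted code in a sandbox rather than with a bare `exec`.

```python
def math_reward(model_output: str, expected_answer: str) -> float:
    """Return 1.0 if the text after the last 'Answer:' marker matches exactly."""
    marker = "Answer:"  # hypothetical output convention
    if marker not in model_output:
        return 0.0
    final = model_output.rsplit(marker, 1)[1].strip()
    return 1.0 if final == expected_answer.strip() else 0.0

def code_reward(source: str, func_name: str, test_cases) -> float:
    """Return the fraction of test cases passed by a function defined in `source`."""
    namespace = {}
    try:
        exec(source, namespace)  # illustration only; sandbox this in practice
        func = namespace[func_name]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply earns no credit for this case
    return passed / len(test_cases)

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

print(math_reward("2 + 2 = 4. Answer: 4", "4"))
print(code_reward(good, "add", tests))
print(code_reward(bad, "add", tests))
```

Because the reward comes from an automatic check rather than a human judgment, it can be computed for as many samples as the training budget allows.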

At this point we may be tempted to claim that human-level artificial intelligence, or something beyond it, is near, because the right rewards can make models more powerful day by day. In 2021, several researchers (including Richard Sutton) published a paper entitled "Reward is Enough" that followed much the same line of thought. That may hold in theory, but there are serious challenges in practice.

Many human tasks, such as management consulting or writing a few lines of poetry, have no measurable reward. In response to this challenge, some researchers are developing RLAIF (reinforcement learning from AI feedback), which uses an artificial intelligence to reward the language model.

A bridge to artificial general intelligence, or a mirage?

Even if efforts to build a comprehensive reward model succeed, one that tells the language model exactly how good its generated text is, scalability, the old problem of reinforcement learning, returns. This is especially true now that current models are equipped with "reasoning", meaning that they generate text several times over before reaching the final output, which consumes more resources.

So, will reinforcement learning take us to artificial general intelligence (AGI)? This is a difficult question in several respects. First, many believe that there is no such thing as "artificial general intelligence". If the yardstick is human-level intelligence, then in some respects humans are already no match for artificial intelligence. Or is the yardstick a uniform balance across all skills? Until the destination is precisely defined, it is meaningless to measure the distance to it.

Another challenge is that research and development evolves without a single guiding mind. After the advent of language models, DeepMind was criticized for gambling on reinforcement learning; had history played out again, it might never have invested in this area, and we would have been deprived of the resulting progress. So the question of what happens next depends on the decisions of researchers and investors, not on the inherent capabilities of technologies!

Finally, it should not be forgotten that research has always been able to surprise us: a new technology may emerge, or an old idea may find new life and displace reinforcement learning (or, better, reinforce it!).
