
RCO NEWS Daily world news agency Based on Dubai, UAE

Reinforcement Learning: A Path to Artificial General Intelligence?

IT Technology
Last updated: 2026/03/04 at 12:29 PM


Contents
  • When machines mastered chess and Go
  • RLHF and the role of humans in chatbot training
  • A bridge to artificial general intelligence, or a mirage?

Hamid Reza Mazandarani, a network and artificial intelligence researcher, examines the history, hidden role, and challenges of reinforcement learning in artificial intelligence in an exclusive note written for Digiato.

Reinforcement learning has been a prominent path over the past few decades, and today it looks more eye-catching than ever. But where does this path lead, and what destination can we expect? The following note takes a brief look at these questions.

In reinforcement learning, an agent adjusts its parameters by interacting with an environment and receiving rewards. In other words, the training data is generated through interaction, with no inherent need for labels or a ready-made training dataset. The approach is regarded as a complement to conventional learning, especially for decision-making problems in which the right action in a given situation is not known in advance.
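To make the idea of learning from reward rather than from labels concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. It is an illustration only and does not come from the original note: the function name `run_bandit`, the payout probabilities, and the hyperparameters are all assumptions chosen for the example.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with Bernoulli (0 or 1) rewards.

    true_means is hidden from the agent; it only ever sees rewards.
    All names and values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # the agent's learned parameters
    counts = [0] * n_arms
    total_reward = 0.0
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # The environment answers with a reward, not with the "correct" arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

estimates, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("best arm:", best_arm, "pulls per arm:", counts)
```

No example in this loop is ever labeled; the agent's own choices produce the data it learns from, which is the point the paragraph above makes.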

Two scientists, Richard Sutton and Andrew Barto, established the scientific framework of reinforcement learning as we know it today in the late 1980s. The underlying ideas, however, were introduced by psychologists in the early twentieth century. You may have heard of the famous “Skinner box”, in which animals learned to obtain food by pressing a lever.

The famous “Skinner box” experiment for testing an animal’s response to reward (Source: Forbes)

Later, however, psychologists found that reinforcement learning is too simple a model to describe human, or even animal, behavior. A well-known example is the phenomenon of “learned helplessness”, in which living beings under frustrating conditions make no attempt to maximize reward, contrary to what reinforcement learning would predict.

When machines mastered chess and Go

However, the main obstacle to reinforcement learning in the world of artificial intelligence was of a different kind: the need for a very large number of interactions with the environment just to behave slightly better than a random agent. In the second half of the last decade, a combination of hardware progress, the rise of deep learning, and more efficient algorithms partially removed this obstacle. This set the stage for DeepMind to defeat chess and Go champions with its intelligent models. These models were trained by playing millions of games against themselves (known as self-play).

At that point, all the evidence suggested that reinforcement learning would become the rising star of artificial intelligence, but the story took a different turn: language models trained to predict text sparked a revolution that transformed human life. These days, ChatGPT and its competitors have become an integral part of people’s lives around the world, and there is even talk of extending their abilities in the form of “intelligent agents”.

But what became of reinforcement learning? Interestingly, it has also contributed to the evolution of language models. The problem with early language models was that they were not ready to converse with humans. By training these models with reinforcement learning and rewarding their responses, the foundation was laid for models better aligned with users’ requests.

RLHF and the role of humans in chatbot training

In 2017, DeepMind, in collaboration with OpenAI, developed a method known as RLHF (reinforcement learning from human feedback). In this algorithm, human users choose the more helpful and safer of two answers produced by the language model. These choices are used to train a reward model, which in turn serves as the basis for training the main model. In a sense, the reward model acts as a referee or critic for the language model.
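The step from pairwise human choices to a reward model can be sketched as follows. This is a simplified illustration, not the method as actually implemented by any lab: it assumes a linear reward model over two hand-made features per answer (helpfulness, harmfulness) and a Bradley-Terry style logistic loss, a common choice for preference data that the note itself does not specify. All names and numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reward(w, x):
    """Linear reward model: a weighted sum of answer features (an assumption)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(pairs, dim, lr=0.5, epochs=200):
    """Fit w so that chosen answers score higher than rejected ones.

    pairs: list of (chosen_features, rejected_features) tuples.
    Minimizes the mean of -log(sigmoid(r(chosen) - r(rejected))).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for chosen, rejected in pairs:
            # Probability the current model assigns to the human's choice.
            p = sigmoid(reward(w, chosen) - reward(w, rejected))
            # Gradient of -log(p) with respect to w.
            for i in range(dim):
                grad[i] += -(1.0 - p) * (chosen[i] - rejected[i])
        w = [wi - lr * g / len(pairs) for wi, g in zip(w, grad)]
    return w

# Hypothetical features per answer: (helpfulness, harmfulness).
pairs = [
    ((0.9, 0.1), (0.4, 0.1)),  # annotator picked the more helpful answer
    ((0.7, 0.0), (0.8, 0.9)),  # annotator picked the safer answer
    ((0.6, 0.2), (0.2, 0.3)),
    ((0.5, 0.0), (0.5, 0.7)),
]
w = train_reward_model(pairs, dim=2)
print("learned weights:", [round(wi, 2) for wi in w])
```

In a real system the reward model is typically a neural network that scores raw text, and its scores are then used as the reward signal when the language model itself is trained with reinforcement learning.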

While RLHF applies reinforcement learning to the base model, scientists were not satisfied and developed other ideas that require no human user at all. The result was methods such as RLVR (reinforcement learning with verifiable rewards), which reward the language model according to whether its answer is correct. The correct answer can be the output of a piece of code or the final answer to a math problem. From now on, whenever a model helps you with coding, remember that it was trained not only to predict text but also by trying to find correct coding answers.
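A verifiable reward of the kind described above can be sketched as a pair of checker functions: one compares a final numeric answer with a reference, the other runs a candidate program against test cases. The function names, the numeric tolerance, and the pass-fraction scoring are assumptions made for illustration; production systems run untrusted code in a sandbox rather than calling `exec` directly.

```python
def math_reward(model_answer: str, reference: str) -> float:
    """Return 1.0 if the final answer matches the reference numerically, else 0.0."""
    try:
        match = abs(float(model_answer.strip()) - float(reference)) < 1e-9
    except ValueError:
        return 0.0  # an unparsable answer earns no reward
    return 1.0 if match else 0.0

def code_reward(source: str, func_name: str, test_cases) -> float:
    """Return the fraction of test cases passed by the candidate program.

    test_cases: list of (args_tuple, expected_result) pairs.
    """
    namespace = {}
    try:
        exec(source, namespace)  # illustration only: a real system would sandbox this
        func = namespace[func_name]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case simply earns nothing
    return passed / len(test_cases)

# Two hypothetical model outputs for the task "write add(a, b)".
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

print(code_reward(good, "add", tests), code_reward(bad, "add", tests))
print(math_reward("42", "42"), math_reward("41", "42"))
```

Because the reward comes from a checker rather than from a person, it can be computed automatically for as many samples as the training run needs.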

Now we may be tempted to claim that human-level or even superhuman artificial intelligence is near, because with the right rewards models can be made more powerful day by day. In 2021, several researchers (including Richard Sutton) published a paper titled “Reward is Enough” that follows a similar line of thought. This may hold in theory, but there are serious challenges in practice.

Many human tasks, such as management consulting or writing a few lines of poetry, have no measurable reward. In response to this challenge, some are working on RLAIF (reinforcement learning from AI feedback), which uses artificial intelligence to reward the language model.

A bridge to artificial general intelligence, or a mirage?

Even if efforts succeed in building a comprehensive reward model that tells the language model exactly how good its generated text is, scalability, the old problem of reinforcement learning, returns. In particular, current models are equipped with “reasoning”, meaning that they generate text several times over before reaching the final output, which means greater resource consumption.

So, will reinforcement learning take us to artificial general intelligence (AGI)? This is a difficult question in several respects. First, many believe that there is no such thing as “artificial general intelligence”. If it means artificial intelligence at the human level, then in some respects humans already cannot compete with artificial intelligence. Or does it mean achieving uniformity and balance across skills? Until the destination is precisely defined, measuring the distance to it is meaningless.

Another challenge is that research and development evolves without a single guiding mind. After the advent of language models, DeepMind was criticized for gambling on reinforcement learning; if history were to repeat itself, it might never have invested in this area, and we would have been deprived of its advances. So the question of which path is taken depends on the decisions of researchers and investors, not on the inherent capabilities of technologies!

Finally, we should not forget that research has always been able to surprise us: a new technology may emerge, or an old idea may find new life and push reinforcement learning aside (or, better, reinforce it!).
