Reinforcement Learning: A Path to Artificial General Intelligence?

IT Technology
Last updated: 2025/09/02 at 7:45 AM


Contents
  • When machines became masters of chess and Go
  • RLHF and the role of humans in chatbot training
  • A bridge toward artificial general intelligence, or a mirage?

Hamid Reza Mazandarani, a network and artificial intelligence researcher, examines the history, hidden role, and challenges of reinforcement learning in artificial intelligence in an exclusive note written for Digiato.

Reinforcement learning has been a high-profile path over the past few decades, and today it looks more eye-catching than ever. But where does this path lead, and what destination can we expect? The following note takes a brief look at these questions.

In reinforcement learning, an agent adjusts its parameters by interacting with an environment and receiving rewards. In other words, the training data is generated through interaction, with no inherent need for labels or a ready-made training set. The approach is regarded as a complement to conventional learning, especially for decision-making problems in which the right action in a given situation is not always clear.
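To make the idea of learning from rewards rather than labels concrete, here is a minimal sketch. It assumes a toy "multi-armed bandit" environment that is not part of the original note: the agent repeatedly picks one of three options, receives a noisy reward, and nudges its estimates toward what it observed. The function name `run_bandit`, the reward means, and the noise level are all illustrative choices.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit (hypothetical environment)."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)  # the agent's learnable parameters
    counts = [0] * len(true_means)
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_means))
        else:
            arm = max(range(len(true_means)), key=lambda a: estimates[a])
        # The environment returns a noisy reward; no label says which arm is "correct".
        reward = rng.gauss(true_means[arm], 0.1)
        counts[arm] += 1
        # Incremental mean update: move the estimate toward the observed reward.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

estimates, counts = run_bandit([0.2, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", [round(e, 2) for e in estimates], "best arm:", best_arm)
```

The only supervision here is the reward signal itself, which is the property that sets reinforcement learning apart from learning on a labeled dataset.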

Two scientists, Richard Sutton and Andrew Barto, founded the scientific framework of reinforcement learning as we know it today in the late 1980s. The underlying ideas, however, go back to psychologists in the early twentieth century. You may have heard of the famous "Skinner box", in which animals learned to obtain food by pressing a lever.

The famous "Skinner box" experiment for studying an animal's response to reward (source: Forbes)

Later, however, psychologists found that reinforcement learning is too simple a model to describe human, and even animal, behavior. A well-known example is the phenomenon of "learned helplessness", in which living beings under frustrating conditions make no attempt to maximize reward, contrary to what reinforcement learning would predict.

When machines became masters of chess and Go

In the world of artificial intelligence, however, the main obstacle to reinforcement learning was of a different kind: an agent needs a very large number of interactions with the environment just to behave slightly better than a random one. In the second half of the last decade, a combination of hardware progress, the rise of deep learning, and more efficient algorithms partially removed this obstacle. As a result, the conditions were in place for DeepMind's models to defeat champions at Go and chess. These models were trained on millions of games played against themselves (known as self-play).
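Self-play can be illustrated on a far smaller game than Go or chess. The sketch below is an assumption-laden toy and not DeepMind's method: it uses the game of Nim (players alternately take one to three stones, and whoever takes the last stone wins) and a single shared value table in place of a deep network. The function name `train_self_play` and all hyperparameters are illustrative.

```python
import random

def train_self_play(stones=10, episodes=20000, alpha=0.5, epsilon=0.3, seed=0):
    """Tabular self-play on Nim: take 1-3 stones, whoever takes the last stone wins."""
    rng = random.Random(seed)
    # q[s][a]: value of taking `a` stones with `s` left, from the mover's point of view.
    q = {s: {a: 0.0 for a in range(1, min(3, s) + 1)} for s in range(1, stones + 1)}
    for _ in range(episodes):
        s = stones
        while s > 0:
            # Both "players" share one table and act epsilon-greedily.
            if rng.random() < epsilon:
                a = rng.choice(list(q[s]))
            else:
                a = max(q[s], key=q[s].get)
            nxt = s - a
            # A winning move is worth +1; otherwise the value is the negation of
            # the opponent's best value in the next position (zero-sum game).
            target = 1.0 if nxt == 0 else -max(q[nxt].values())
            q[s][a] += alpha * (target - q[s][a])
            s = nxt
    return q

q = train_self_play()
best_move = {s: max(q[s], key=q[s].get) for s in q}
print(best_move)
```

No human games are involved at any point: the agent improves only by playing against its own current policy. In this small game the learned policy tends toward the known winning strategy of leaving the opponent a multiple of four stones.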

At that point, all the evidence suggested that reinforcement learning would be the rising star of artificial intelligence, but the story took a different turn: language models trained to predict text set off a revolution that has transformed human life. These days, ChatGPT and its competitors have become an integral part of people's lives around the world, and there is even talk of extending their abilities in the form of "intelligent agents".

But what became of reinforcement learning? Interestingly, reinforcement learning has also contributed to the evolution of language models. The problem with the early language models was that they were not ready to converse with humans. Training these models with reinforcement learning, by rewarding their responses, laid the groundwork for models that are better aligned with what users ask for.

RLHF and the role of humans in chatbot training

In 2017, DeepMind, in collaboration with OpenAI, developed a method known as RLHF (reinforcement learning from human feedback). In this algorithm, human users choose the more useful and safer of two answers produced by the language model. These choices are used to train a reward model, which in turn serves as the basis for training the main model. In a sense, the reward model acts as a referee or critic for the language model.
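The step from pairwise human choices to a reward model can be sketched as follows. This is a simplified illustration and not the published RLHF pipeline: it assumes each answer has already been reduced to two hand-made numeric features, and it fits a linear reward with the Bradley-Terry preference model, in which the probability that one answer is preferred is the sigmoid of the reward difference. The features, data, and function names are hypothetical.

```python
import math

def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit a linear reward r(x) = w . x from (preferred, rejected) feature pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Bradley-Terry: P(chosen preferred) = sigmoid(r(chosen) - r(rejected)).
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on log P; the gradient is (1 - p) * diff.
            w = [wi + lr * (1.0 - p) * di for wi, di in zip(w, diff)]
    return w

def reward(w, x):
    """Score one answer with the learned reward model."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical features per answer: [helpfulness score, unsafe-content score].
# In each pair, the first answer is the one the human annotator preferred.
pairs = [
    ([0.9, 0.1], [0.4, 0.1]),
    ([0.7, 0.0], [0.8, 0.9]),
    ([0.6, 0.2], [0.2, 0.3]),
]
w = train_reward_model(pairs, dim=2)
print("learned weights:", [round(wi, 2) for wi in w])
```

Once trained, `reward(w, x)` plays the referee role described above: the language model would then be updated by reinforcement learning to produce answers that this model scores highly.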

Although RLHF applies reinforcement learning to the base model, researchers were not satisfied and developed other ideas that require no human at all. The result was methods such as RLVR (reinforcement learning with verifiable rewards), which reward the language model according to whether its answer is correct. The correct answer can be the output of a piece of program code or the final answer to a math problem. From now on, whenever a model helps you with coding, remember that it was trained not only to predict text but also by trying to find correct coding answers.

Now we may be tempted to claim that human-level artificial intelligence, or something beyond it, is near, because the right rewards can make models more powerful day by day. In 2021, several researchers (including Richard Sutton) published a paper entitled "Reward is Enough" that followed much the same line of thought. That may hold in theory, but there are serious challenges in practice.

Many human tasks, such as management consulting or writing a few lines of poetry, have no measurable reward. In response to this challenge, some are working on RLAIF (reinforcement learning from AI feedback), which uses an AI model to reward the language model.
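A verifiable reward is simply a function that checks an answer automatically. The sketch below shows two hypothetical checkers of the kind the paragraph above describes, one for the final answer of a math problem and one for generated code run against test cases. The function names and the test cases are illustrative assumptions, and a real system would run untrusted code in a sandbox rather than with a bare `exec`.

```python
import re

def math_reward(response: str, expected: float, tol: float = 1e-9) -> float:
    """Return 1.0 if the last number in the response matches the reference answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    if not numbers:
        return 0.0
    return 1.0 if abs(float(numbers[-1]) - expected) <= tol else 0.0

def code_reward(source: str, func_name: str, cases) -> float:
    """Return the fraction of test cases that a generated function passes."""
    namespace = {}
    try:
        exec(source, namespace)  # illustration only; sandbox this in practice
        func = namespace[func_name]
    except Exception:
        return 0.0  # code that does not even load earns no reward
    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case counts as a failure
    return passed / len(cases)

# Two hypothetical model outputs for the task "write add(a, b)".
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
print(code_reward(good, "add", cases), code_reward(bad, "add", cases))
print(math_reward("6 times 7, so the answer is 42", 42))
```

No human judgment enters the loop: the reward comes entirely from a check that can be repeated at scale.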

A bridge toward artificial general intelligence, or a mirage?

Even if efforts succeed in building a comprehensive reward model, one that tells the language model exactly how "good" its generated text is, scalability, the old problem of reinforcement learning, returns. In particular, current models are equipped with "reasoning", meaning that they generate text several times over before reaching the final output, which consumes more resources.

So, will reinforcement learning take us to artificial general intelligence (AGI)? This is a difficult question in several respects. First, many believe that there is no such thing as "artificial general intelligence". If the benchmark is human-level ability, then in some respects humans already cannot compete with artificial intelligence. Or is the criterion uniformity and balance across skills? Until the destination is precisely defined, it is meaningless to measure the distance to it.

Another challenge is that research and development does not evolve under a single guiding mind. After the rise of language models, DeepMind was criticized for having gambled on reinforcement learning; if history were replayed, it might never have invested in this area, and we would have been deprived of its advances. So the question of what will happen depends on the decisions of researchers and investors, not on the inherent capabilities of the technologies!

Finally, it should not be forgotten that research has always been able to surprise us: a new technology may emerge, or an old idea may find new life and push reinforcement learning aside (or, better, reinforce it!).

TAGGED: artificial, comprehensive, intelligence, learning, reinforcement