Google has introduced a new artificial intelligence model called RT-2 that helps robots better understand visual and linguistic patterns. With this model, robots can interpret both what they see and what is said to them. Building on Google's previous vision-language-action (VLA) work, RT-2 recognizes patterns in images and language more accurately.
According to Hoshio, Google has made a significant leap in the intelligence of its robots by introducing the Robotics Transformer 2 (RT-2), an advanced artificial intelligence learning model. Robots equipped with RT-2 can better understand instructions and select the appropriate objects for specific tasks.
In recent experiments, researchers paired RT-2 with a robot fitted with a robotic arm in a simulated kitchen environment. The robot was asked to perform tasks such as identifying a hammer, choosing a drink for a tired person, and moving a can of Coca-Cola. The RT-2-equipped robot completed all of these tasks correctly.
To develop RT-2, Google trained the model on a combination of internet and robotics data and drew on Bard, its proprietary language model, to help RT-2 understand language. RT-2 is also adept at following instructions in languages other than English, a significant advance in language understanding for AI-based robots.
Before the advent of VLA models like RT-2, training robots was a long and difficult process that required painstaking, time-consuming programming for each specific task. With these advanced models, robots now draw on a vast repository of data to make informed decisions on the spot, which simplifies their training.
Google's effort to develop smarter robots began last year, when it integrated its large language model PaLM into robotics, which led to a system called PaLM-SayCan. Google intended the system to make it easier for its robots to understand language and work with objects in the real world, an important step toward its broader robotics goals.
However, the new robot is not without flaws. During a live demonstration covered by The New York Times, the robot had trouble correctly distinguishing different flavors of soda and misidentified a fruit as the color white. Such shortcomings highlight the ongoing challenges of refining AI technology for real-world applications.