Google plans to combine two of its most powerful artificial intelligence models, Gemini and Veo.
Demis Hassabis, CEO of Google DeepMind, said Google plans to eventually combine its Gemini AI models with its Veo video-generation models to improve these systems' understanding of the physical world.
“We designed Gemini as a multimodal model because our goal was to build a universal digital assistant, a smart assistant that can really help you in the real world,” Hassabis said.
As the artificial intelligence industry moves toward “omni” models, which can understand and produce many types of content such as text, images, audio and video, Google is also working to expand the capabilities of its advanced models.
The newest versions of Gemini can generate audio, images and text, while OpenAI's default model in ChatGPT can also generate images (including artwork in the style of Studio Ghibli). Amazon has also announced that it will unveil an “any-to-any” model by the end of this year.
These comprehensive models require a huge amount of varied training data, including images, video, audio and text. According to Hassabis, the Veo video model learns the rules of the real world mainly from YouTube videos. “Veo can figure out real-world physics by watching a lot of videos on YouTube,” he said.
Google had earlier said that its models “may be” trained on “some” YouTube content, in accordance with its agreement with YouTube creators. Reports also indicate that the company broadened its terms of service last year to allow it to use more data to train its artificial intelligence models.
RCO NEWS