Nvidia has entered the field of world models: artificial intelligence models that take inspiration from human mental models to understand and predict the world. At CES 2025, the company introduced a family of world models called Cosmos World Foundation Models, or Cosmos WFMs for short, that can generate and predict physics-aware videos.
Details of Nvidia’s Cosmos WFMs
These models, which can be fine-tuned for specific applications, are available through the Nvidia API, the NGC catalog, and the Hugging Face AI developer platform. In a blog post, Nvidia announced that Cosmos WFMs are available for physics-based simulation and synthetic data generation. Researchers and developers at companies of any size can use Cosmos models for free under Nvidia’s open license, which permits commercial use.
Classification and sizes of Nvidia’s world models
The Cosmos WFM family includes three tiers: Nano models for low-latency and real-time applications, Super models as standard high-performance models, and Ultra models for high-quality outputs and maximum accuracy.
The models range in size from 4 billion to 14 billion parameters. Models with more parameters generally handle complex problems better than smaller ones.
Nvidia has also released models for enhancing video resolution and for generating sensor data for self-driving cars, as well as models intended to support responsible use. These models were trained on 9 trillion tokens from 20 million hours of real-world data covering human interactions, environments, industrial settings, robotics, and driving.
Questions about the data used in these models
Nvidia did not disclose where the data was collected. However, there have been reports and complaints that the company has used copyrighted YouTube videos without permission. An Nvidia spokesperson said in response to these accusations:
“Cosmos is not designed to copy or infringe copyrighted works. These models learn like humans, and we are confident that our use of data is consistent with the spirit and letter of the rules.”
Applications of Cosmos WFMs
Cosmos WFMs can generate controllable, high-quality synthetic data from inputs such as text or video frames. This data can be used to train artificial intelligence models in areas such as robotics and self-driving cars.
In its blog post, Nvidia said that Cosmos WFMs are designed specifically for physical AI research and development and can generate physics-based videos from a combination of inputs such as text, images, video, and sensor data.
Companies such as Waabi, Wayve, Foretellix, and Uber have committed to testing Cosmos WFMs for use cases ranging from video search and curation to developing artificial intelligence models for self-driving cars.
Uber CEO Dara Khosrowshahi said of these models:
“Generative AI is shaping the future of the transportation industry and requires rich data and extremely powerful computing power. Working with Nvidia, we are confident we can accelerate the timeline for delivering safe and scalable self-driving solutions.”
An important point about models being open source
Although Nvidia describes Cosmos WFMs as open source, these models are not open source in the strict sense of the term. Open source means providing enough information about a model’s design that anyone can reconstruct it, and disclosing details about its training data, including its source and how it was obtained or licensed.
Nvidia has not released full details of the training data for Cosmos WFMs, nor has it made available the tools needed to rebuild the models. For this reason, the models are better described as open than as open source.
RCO NEWS