Chinese company Tencent has unveiled a new artificial intelligence model called HunyuanWorld-Voyager that can turn a photo into 3D video.
According to reports, the new model lets users set the camera's direction and move through virtual scenes generated from the photo. The model produces video and depth data simultaneously, enabling 3D-style output without traditional modeling tools.
Of course, the results this model produces are not true 3D models but two-dimensional videos that simulate camera movement through a 3D environment while maintaining spatial consistency. It also generates only 49 frames (about two seconds of video) per run, although multiple clips can be chained together into videos several minutes long.
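The clip-chaining arithmetic above is straightforward; a minimal sketch, assuming the reported figure of 49 frames per roughly two-second clip (the helper names are illustrative, not part of Voyager's tooling):

```python
# Back-of-the-envelope math for chaining Voyager's 49-frame clips.
# SECONDS_PER_CLIP follows the article's "49 frames ~ 2 seconds";
# the exact frame rate is an inference, not an official spec.
import math

FRAMES_PER_CLIP = 49
SECONDS_PER_CLIP = 2.0

def clips_needed(target_seconds: float) -> int:
    """Number of consecutive clips needed to cover target_seconds."""
    return math.ceil(target_seconds / SECONDS_PER_CLIP)

def total_frames(target_seconds: float) -> int:
    """Total frames generated across the chained clips."""
    return clips_needed(target_seconds) * FRAMES_PER_CLIP

# A three-minute video would take 90 chained clips (4,410 frames):
print(clips_needed(180))   # 90
print(total_frames(180))   # 4410
```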
The model's input is simply an image and a camera path. Movements such as forward, backward, rotation, or lateral motion can be adjusted through its interface.
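To make the "image plus camera path" input concrete, here is a hypothetical sketch of how such a request might be structured; every field and move name below is an illustrative assumption, not Voyager's actual API:

```python
# Hypothetical request shape for an image + camera-path model.
# Field names and move kinds are invented for illustration only.
from dataclasses import dataclass, field
from typing import List

@dataclass
class CameraMove:
    kind: str      # e.g. "forward", "backward", "rotate_left"
    amount: float  # assumed units: metres for moves, degrees for turns

@dataclass
class VoyagerRequest:
    image_path: str
    camera_path: List[CameraMove] = field(default_factory=list)

# Walk forward, turn left 30 degrees, then continue forward:
request = VoyagerRequest(
    image_path="scene.jpg",
    camera_path=[
        CameraMove("forward", 1.5),
        CameraMove("rotate_left", 30.0),
        CameraMove("forward", 0.5),
    ],
)
print(len(request.camera_path))  # 3
```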
Tencent says the new model was trained on more than 100,000 video clips, including real-world footage and Unreal Engine renderings. The data were processed automatically by software that computes the camera movement and depth for each frame.
Limits of Tencent's AI Model
However, the constraints of the Transformer architecture mean the model can only reproduce patterns seen in its training data and makes errors in entirely new situations. For this reason, Voyager struggles to produce 360-degree rotations.

In terms of performance, Voyager achieved the highest score, 77.62, on Stanford University's benchmark. The model performs brilliantly in object control, lighting consistency, and output quality, but came in second to WonderWorld in camera control.
Running the model also demands powerful hardware: it requires at least 60 GB of graphics memory for 540p output. Tencent has already released the model weights on Hugging Face and made the code available for implementation.
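Given that 60 GB figure, a quick way to check whether a given GPU clears the bar is to compare its total memory against the threshold; a minimal sketch, where only the 60 GB requirement comes from the article and the helper itself is generic:

```python
# Check whether a GPU's total memory meets the reported 60 GB
# requirement for Voyager's 540p output. The 60 GB threshold is
# from the article; this helper is a generic sketch, not part of
# Voyager's released code.
REQUIRED_GB = 60

def meets_requirement(vram_bytes: int, required_gb: int = REQUIRED_GB) -> bool:
    """True if total device memory covers the requirement."""
    return vram_bytes / (1024 ** 3) >= required_gb

# An 80 GB accelerator passes; a 24 GB consumer card does not:
print(meets_requirement(80 * 1024 ** 3))  # True
print(meets_requirement(24 * 1024 ** 3))  # False
```

With PyTorch installed, the real value to pass in would be `torch.cuda.get_device_properties(0).total_memory`.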



