This AI tool can create 3D videos from single images
Stability AI has launched a new artificial intelligence (AI) model, Stable Video 3D (SV3D), engineered to create orbital videos from single image inputs. Unlike counterparts such as OpenAI's Sora, Runway AI, and Pika 1.0, SV3D doesn't rely on text prompts; instead, it converts a flat image into a dynamic 3D model. The company has made the model available for both commercial and non-commercial use.
It offers multi-view capabilities
Stability AI announced the debut of Stable Video 3D on X, describing the model as a substantial leap forward in 3D technology. Built on Stable Video Diffusion, it offers improved quality and multi-view capabilities. The release comes roughly a month after the firm unveiled Stable Diffusion 3, which was developed to boost performance on multi-subject prompts.
Two editions tailored to diverse user requirements
Stability AI offers SV3D in two editions tailored to different needs: SV3D_u and SV3D_p. The former, SV3D_u, generates orbital videos from single image inputs without requiring camera conditioning. The more sophisticated SV3D_p supports both single images and orbital views, allowing it to generate fully realized 3D videos along predetermined camera paths.
Overcoming limitations of earlier models
The SV3D model tackles the view-inconsistency issues that affected earlier models like Stable Zero123. It leverages Neural Radiance Fields (NeRF) and mesh representations to improve the quality and consistency of the rendered videos. To mitigate baked-in lighting problems, SV3D uses a disentangled illumination model that is optimized jointly with 3D shape and texture.