
Text-to-video model

A text-to-video model is a machine learning model that takes a natural language description as input and produces a video relevant to the input text.[1] Recent advancements in generating high-quality, text-conditioned videos have largely been driven by the development of video diffusion models.[2]
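As a rough illustration of the input/output contract described above, the following toy sketch maps a text prompt to an array of video frames by starting from random noise and refining it step by step. It is not a real video diffusion model. The names `embed_prompt`, `toy_denoiser`, and `generate_video` are hypothetical, the "denoiser" is a hand-written function rather than a trained network, and the update rule is a simplification chosen only to make the loop easy to follow.

```python
import hashlib

import numpy as np


def embed_prompt(prompt: str, dim: int = 8) -> np.ndarray:
    """Hypothetical stand-in for a text encoder: hash the prompt into a vector in [0, 1]."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return np.frombuffer(digest[:dim], dtype=np.uint8).astype(np.float64) / 255.0


def toy_denoiser(video: np.ndarray, text_emb: np.ndarray, t: int) -> np.ndarray:
    """Hypothetical stand-in for a learned network.

    A real model would predict the noise from the noisy video, the timestep t,
    and the text embedding. Here the "noise" is simply the gap between the
    current video and a brightness value derived from the prompt (t is unused).
    """
    target = text_emb.mean()
    return video - target


def generate_video(prompt: str, frames: int = 4, height: int = 8,
                   width: int = 8, steps: int = 20, seed: int = 0) -> np.ndarray:
    """Return an array of shape (frames, height, width, 3) conditioned on the prompt."""
    rng = np.random.default_rng(seed)
    # Start from pure Gaussian noise over the whole clip (all frames at once).
    video = rng.standard_normal((frames, height, width, 3))
    text_emb = embed_prompt(prompt)
    # Iteratively remove the predicted noise, from the noisiest step down to step 1.
    for t in range(steps, 0, -1):
        predicted_noise = toy_denoiser(video, text_emb, t)
        video = video - predicted_noise / t
    return video


clip = generate_video("a cat surfing a wave")
print(clip.shape)
```

The point of the sketch is the shape of the computation: text goes in, a stack of frames comes out, and the frames are produced jointly by repeatedly refining a noisy clip under the guidance of the text embedding.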

Text-to-image model

VideoPoet: Google's unreleased model, a precursor of Lumiere

Sora: an unreleased OpenAI model

Runway: the company developing the Gen-1 and Gen-2 models