Text-to-video generator: OpenAI working on new AI model Sora

OpenAI says that Sora, its new AI model, is also capable of animating still images

Alexander Lewis - Feb 16, 2024

OpenAI announced on Thursday that it is working on a new artificial intelligence (AI) model, called Sora, that is capable of generating videos with just text prompts.

In a post shared on X, formerly known as Twitter, the company maintained that Sora, its new generative AI tool, can "create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions."

Citing sources familiar with the matter, Reuters said the company is "working with red teamers — domain experts in areas like misinformation, hateful content, and bias — who are adversarially testing the model."

Not only can Sora generate videos from text prompts, but it can also recognise videos that it generated itself.

The likely reason for having "red teamers" test the tool is that the GPT maker wants to ensure that Sora, when prompted by a user, does not produce videos containing hate, prejudice, disinformation, misinformation, or any other content that might offend viewers.

It was also learned that OpenAI has made Sora available only to red teamers, who will diagnose errors in the AI system, and to visual artists, designers and filmmakers, who will give feedback on the model's functionality and performance.

"Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background," the company said in a blog post, adding that it can create multiple shots in a single video. The company also underscored Sora's ability to animate still images.

The generative AI tool builds on OpenAI's earlier research on its DALL-E and GPT models. The GPT models also power the company's ChatGPT chatbot, which was released in 2022 and took the world by storm by writing poems, emails, and code for software programmes.

The company's official blog post added that the tool is a work in progress: it can currently confuse the spatial details of a prompt and has difficulty following a camera trajectory specified by the user.