Meta introduces two new AI-based video and image editing features

Emu Video and Emu Edit will help users create and post content on Instagram and Facebook
A representational image of an editor editing pictures and video. — Canva

If you post content on Facebook and Instagram often but find it difficult to edit pictures and videos, Meta Platforms has addressed your problem by introducing two new AI-based features: Emu Video and Emu Edit.

According to details shared by Reuters, the two AI-based features will make it easier to create and post content on Instagram and Facebook.

The first feature, Emu Video, generates four-second-long videos from a prompt: a caption, a photo, or an image paired with a description.

The second feature, Emu Edit, lets users alter or edit images more easily with text prompts.

Emu is the generative AI model that underpins Meta's AI image editing tools for Instagram, which let users take a photo and change its visual style or background.

Since OpenAI's ChatGPT launched late last year, businesses and enterprises have flocked to the nascent generative AI market in search of new capabilities and ways to refine their processes.

The social media giant has been making rapid strides in the AI universe. The technology has become one of its most significant focus points as it looks to compete with other giants such as Microsoft, Alphabet's Google and Amazon.

Functions of the two features:

Emu Edit

According to a statement released by Meta, Emu Edit is capable of free-form editing through instructions, encompassing tasks such as local and global editing, removing and adding a background, colour and geometry transformations, detection and segmentation, and more. 

"Current methods often lean towards either over-modifying or under-performing on various editing tasks. We argue that the primary objective shouldn’t just be about producing a “believable” image. Instead, the model should focus on precisely altering only the pixels relevant to the edit request.," the statement read.

Unlike many generative AI models today, Emu Edit precisely follows instructions, ensuring that pixels in the input image unrelated to the instructions remain untouched. 
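Emu Edit itself has not been released publicly. As a rough illustration of the same idea, instruction-driven image editing that changes only what the prompt asks for, the sketch below uses the open InstructPix2Pix model through Hugging Face's diffusers library as a stand-in; the file names, prompt and parameter values are illustrative assumptions, not Meta's code.

```python
# Illustrative sketch of instruction-based image editing (not Emu Edit itself,
# which is unreleased). Uses the open InstructPix2Pix model via diffusers.
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

image = Image.open("photo.jpg").convert("RGB")  # hypothetical input file

# A free-form text instruction; ideally only the pixels relevant to the
# edit change, which is the behaviour Emu Edit claims to improve.
edited = pipe(
    "replace the background with a beach at sunset",
    image=image,
    num_inference_steps=20,
    image_guidance_scale=1.5,  # higher values keep the output closer to the input
).images[0]
edited.save("edited.jpg")
```

In practice, off-the-shelf models like this one often over-edit or under-edit, which is precisely the shortcoming Meta's statement calls out.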

Emu Video

Emu Video leverages Meta's Emu model. In the company's words, "we present a simple method for text-to-video generation based on diffusion models."

This is a unified architecture for video generation tasks that can respond to a variety of inputs: text only, image only, and both text and image. 

Meta stated: "We’ve split the process into two steps: first, generating images conditioned on a text prompt, and then generating video conditioned on both the text and the generated image.

"This “factorised” or split approach to video generation lets us train video generation models efficiently. We show that factorised video generation can be implemented via a single diffusion model. We present critical design decisions, like adjusting noise schedules for video diffusion, and multi-stage training that allows us to directly generate higher-resolution videos."