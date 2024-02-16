Microsoft-backed OpenAI recently introduced Sora, a text-to-video model, as its latest move to stay ahead of competitors. The launch reflects OpenAI's effort to keep its competitive edge in the fast-moving artificial intelligence (AI) market at a time when text-to-video tools are growing in popularity.

What is Sora?

The new model, called Sora (from the Japanese word for "sky"), can generate lifelike videos up to one minute long that follow the user's instructions on subject matter and style. According to a company blog post, the model can also add new content to existing videos or create a video from a still image.

The company wrote in a blog post, "We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction."

The company announced that only a small number of researchers, graphic artists, and filmmakers currently have access to Sora. Following the announcement, however, CEO Sam Altman responded to user requests on Twitter by sharing video clips he said were created by Sora. The videos carry a watermark to show that they were generated by AI.

How does Sora work?

The Guardian offers an analogy: picture a TV screen that is noisy and covered with static, then gradually reduce the fuzziness until you are left with a clear, moving picture. In essence, that is what Sora does. It starts from noise and removes it step by step to produce a video, using what is known as a "transformer architecture".
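
The static-to-picture analogy can be sketched in a few lines of Python. This is a toy illustration only, not Sora's actual method: the function names, the step size, and the five-value "frame" are all assumptions made for the example. In particular, the stand-in denoiser below is simply handed the clean picture, whereas a real model has to learn to predict it from the noisy input and the text prompt.

```python
import random


def denoise_step(noisy, predicted_clean, strength=0.2):
    """Move each value a fraction of the way toward the predicted clean value."""
    return [n + strength * (c - n) for n, c in zip(noisy, predicted_clean)]


def toy_denoise(clean, steps=30, seed=0):
    """Start from random static and repeatedly reduce the noise."""
    rng = random.Random(seed)
    # Begin with pure static: random brightness values between 0 and 1.
    frame = [rng.uniform(0.0, 1.0) for _ in clean]
    errors = []
    for _ in range(steps):
        # Assumption: we pass in the clean frame directly. A trained model
        # would have to predict it instead.
        frame = denoise_step(frame, clean)
        # Track the largest remaining difference from the clean frame.
        errors.append(max(abs(f - c) for f, c in zip(frame, clean)))
    return frame, errors


clean_frame = [0.0, 0.25, 0.5, 0.75, 1.0]  # a tiny five-pixel "picture"
result, errors = toy_denoise(clean_frame)
print(f"remaining noise after {len(errors)} steps: {errors[-1]:.4f}")
```

Each pass shrinks the remaining noise by a fixed fraction, so the static fades gradually rather than disappearing in a single step.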

Rather than working frame by frame, Sora can produce an entire video at once. Users direct the content of the video by giving the model text descriptions, and the model can keep a person consistent even if they briefly walk off-screen.

Consider GPT models, which generate text from tokens, small units of words. Sora does something similar with images and videos, dividing them into smaller units known as patches. The company has not, however, disclosed what data was used to train the model.
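
To make the idea of patches concrete, here is a minimal sketch that cuts a tiny video into small blocks spanning both space and time. The function name, the patch size, and the nested-list layout are assumptions chosen for illustration. How Sora actually sizes and encodes its patches is not described in this article.

```python
def split_into_patches(video, pt, ph, pw):
    """Split a video (frames x rows x cols) into flattened patches.

    Each patch covers pt frames, ph rows and pw columns.
    """
    frames, rows, cols = len(video), len(video[0]), len(video[0][0])
    if frames % pt or rows % ph or cols % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    patches = []
    for t0 in range(0, frames, pt):
        for r0 in range(0, rows, ph):
            for c0 in range(0, cols, pw):
                # Collect every pixel inside this block into one flat list.
                patch = [
                    video[t][r][c]
                    for t in range(t0, t0 + pt)
                    for r in range(r0, r0 + ph)
                    for c in range(c0, c0 + pw)
                ]
                patches.append(patch)
    return patches


# A toy video: 4 frames of 4x4 pixels, each pixel holding a unique number.
video = [[[t * 16 + r * 4 + c for c in range(4)] for r in range(4)] for t in range(4)]
patches = split_into_patches(video, pt=2, ph=2, pw=2)
print(len(patches), "patches of", len(patches[0]), "values each")
```

Each patch then plays a role loosely comparable to a token in a text model: a small, uniform unit that the model can process in sequence.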

Sora's weaknesses

In the blog post, the company acknowledged that the current model has weaknesses. According to the statement, it may struggle to accurately simulate the physics of a complex scene and may not understand specific instances of cause and effect.

