On February 15, 2024 (US time), OpenAI unveiled ‘Sora’, a new generative AI that can create videos up to one minute long from simple text prompts. However, a release date for this highly anticipated system has not yet been determined.
The following video was actually generated by Sora.
With just a one-sentence prompt, Sora can generate video of this quality.
This article will provide an overview of Sora, its features and problems. Please read to the end.
What is OpenAI’s Sora?
Sora is a video generation AI model announced by OpenAI on February 15, 2024.
Although it is not available to the public at this time (as of February 2024), OpenAI intends to make it available to general users eventually.
Sora is capable of generating much higher quality videos than previous video generation AIs, and can create videos up to one minute in length. The following video, which was actually generated by Sora, shows that it has reached a level where it is difficult to distinguish from live-action video.
However, at this stage, due to challenges in physical simulation and the prevention of misinformation spread, the general release of Sora is on hold until appropriate safety measures are in place. OpenAI is collaborating with expert teams to ensure Sora’s safety and plans to develop tools for detecting generated videos.
Sora represents a significant milestone in the advancement of AI technology, marking a step toward the ultimate goal of achieving AGI (Artificial General Intelligence).
Source: OpenAI, ‘Creating video from text’
What can Sora do?
Sora, developed by OpenAI, does more than just generate video from text.
Let’s take a look at what kind of functions it has.
Text-to-Video
First is the Text-to-Video function. This function allows users to generate a video simply by giving text instructions.
Until now, conventional text-to-video AIs could only generate videos of a few dozen seconds at most. Sora, however, can generate videos up to one minute in length, and the quality is so good that they can be mistaken for live-action video.
For example, Sora can generate one-minute videos like the following:
Although there are some flaws, such as the Japanese text in the background, it is hard to tell the video apart from real footage.
If you can generate a high-quality one-minute video with simple text instructions, you will be able to use it to create short videos for posting on TikTok, short videos for advertisements, etc.
Image-to-Video
Sora supports not only text input, but also image input. In other words, it can animate images.
For example, images like the following can be animated.
If you can create an arbitrary image with ChatGPT’s image generation function and then animate it with Sora, the range of applications is likely to expand dramatically.
Video-to-Video
As with image input, Sora also accepts video input.
For example, the original video below can be transformed so that it takes place in an underwater world.
Original Video
Converted to an underwater world
As shown above, Sora allows various edits to be made to the original video.
Normally, video editing like this would be extremely difficult, and even where it is possible, it would take an enormous amount of time and money. Handing the entire process over to generative AI should shorten editing and will likely reduce the workload of those involved in video production.
Image Generation
Sora can also generate high-quality images. It can generate images with a resolution of up to 2048 x 2048, and can even produce images of people that look just like photographs, as shown below.
The quality is high enough that the images are hard to tell apart from photographs.
Incidentally, the current paid version of ChatGPT uses an image generation AI called “DALL-E 3”. Since Sora is also developed by OpenAI, the same company that operates ChatGPT, the quality of image generation in ChatGPT may improve further.
Sora’s Challenges
Although Sora is capable of generating high-quality videos, there are some challenges.
For example, the AI does not fully understand physics, so scenes such as “glass breaking”, as shown below, are difficult for it to render correctly.
Another problem is that in scenes that include a large number of entities, such as people or animals, these entities may suddenly appear from unnatural locations.
The ability to generate high-quality video also carries the risk of “realistic video being misused,” and OpenAI is currently continuing its research to remedy the problem.
When the system is ready, the general public will be able to use Sora.
Summary
In this article, we discussed Sora, a video generation AI developed by OpenAI.
Since the release of ChatGPT at the end of 2022, generative AI has undergone remarkable evolution. With more and more new services being released, the era in which AI is commonplace is already just around the corner.
To thrive in the age of AI, let’s keep learning about generative AI and make greater use of it in our business and personal lives.