Just moments ago, OpenAI’s Sora made its official debut.
The launch event kept to the fast-paced “short drama” format: it ran about 20 minutes and was hosted by CEO Sam Altman, Sora lead Bill Peebles, and others.
OpenAI announced on X that it has been developing Sora Turbo, a significantly faster version of the model, since February, and that it is now available as a standalone product to ChatGPT Plus and Pro users.
Demand was so heavy that the website buckled under the surge of users, temporarily halting registration and login. Altman took to X to reassure users:
“Due to unprecedented demand, we’ll need to periodically pause new user registrations, and content generation will be slower for a while. We’re working at full capacity!”
Access Sora at: Sora.com
Sora Interface Revealed: 6 Game-Changing Features – Is Video Editing Becoming Obsolete?
Similar to Midjourney’s web interface, Sora has its own dedicated user interface where users can organize and browse generated videos, as well as view other users’ prompts and featured content.
The “Library” feature allows users to save favorite or useful prompts for future use. Saved prompts can be viewed or modified as needed, significantly improving efficiency for users who frequently create similar content.
In terms of workflow, Sora’s editing capabilities set it apart from competitors.
For instance, the Remix feature allows users to edit videos using pure natural language prompts, with simple “strength” options and sliders to control the degree of changes.
The Re-cut feature identifies and isolates the best shots and can extend them in either direction to complete a scene.
Sora’s Storyboard feature functions like a video editor, connecting multiple prompts to create longer videos, easily handling complex multi-step scenes.
With the Loop and Blend features, users can create seamless looping videos and merge different clips into one, while Style presets offer ready-made generation styles that can be adjusted.
Technically, Sora supports 5-20 second video generation and is compatible with mainstream aspect ratios like 1:1 and 9:16. Generation speed has improved significantly compared to early versions.
Several important details to note:
OpenAI has implemented a flexible credit-based pricing strategy, with credit amounts varying by resolution and duration. ChatGPT Plus and Pro members can use the service without additional costs.
For example, generating a 480p, 5s video requires 25 credits, while a 480p, 20s video needs 150 credits.
Using Re-cut, Remix, Blend, or Loop on content longer than 5 seconds also costs extra credits, so charges scale with both usage and clip length.
For subscribers, the $20-per-month ChatGPT Plus plan includes up to 50 priority videos (1,000 credits), at up to 720p resolution and 5 seconds in length.
The $200-per-month ChatGPT Pro plan includes up to 500 priority videos (10,000 credits), at up to 1080p resolution and 20 seconds in length, with 5 concurrent generations and watermark-free output.
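To make the credit arithmetic concrete, here is a minimal sketch that uses only the figures quoted above. The table and function names are illustrative, not part of any OpenAI API, and the cost table covers only the two configurations the article mentions. Note that the plans' headline video counts work out to 20 credits per video, which is lower than the 25-credit example; the article does not say which configuration that figure corresponds to.

```python
# Credit costs quoted in the article: (resolution, seconds) -> credits.
# Only these two configurations are given; others are unknown here.
QUOTED_COSTS = {
    ("480p", 5): 25,
    ("480p", 20): 150,
}

# Plan allowances quoted in the article (names and layout are illustrative).
PLANS = {
    "Plus": {"credits": 1000, "priority_videos": 50, "max_seconds": 5},
    "Pro": {"credits": 10000, "priority_videos": 500, "max_seconds": 20},
}


def clips_affordable(plan: str, resolution: str, seconds: int) -> int:
    """Whole clips of one quoted configuration that a plan's credits cover."""
    limits = PLANS[plan]
    if seconds > limits["max_seconds"]:
        raise ValueError(f"{plan} is limited to {limits['max_seconds']}s clips")
    return limits["credits"] // QUOTED_COSTS[(resolution, seconds)]


def implied_credits_per_video(plan: str) -> float:
    """Credits per video implied by the plan's headline video count."""
    limits = PLANS[plan]
    return limits["credits"] / limits["priority_videos"]


if __name__ == "__main__":
    print(clips_affordable("Plus", "480p", 5))   # 40
    print(clips_affordable("Pro", "480p", 5))    # 400
    print(clips_affordable("Pro", "480p", 20))   # 66
    print(implied_credits_per_video("Plus"))     # 20.0
```

The quoted prices are not linear in duration: the 20-second clip is four times as long as the 5-second clip but costs six times as many credits (150 versus 25).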
OpenAI is developing different pricing models for various user types, to be launched early next year.
Note that Sora doesn’t currently support ChatGPT Team, Enterprise, or Edu versions, and isn’t available to users under 18. Currently, Sora can be accessed wherever ChatGPT is available, except in the UK, Switzerland, and EU regions.
Hands-on Testing Reveals Sora’s Limitations, but Some Use Cases Rival Professional Quality
Popular YouTuber Marques Brownlee got early access to Sora a week ago and shared his experience.
He pointed out several limitations of the product.
On physics simulation, the model’s grasp of how objects move is still immature, which often results in unnatural motion and objects that suddenly disappear. Legs are a particular weak spot: the model frequently confuses which leg is in front and which is behind, so walking looks wrong.
Sometimes part of a generated video appears to be in slow motion while the rest plays at normal speed, creating noticeable inconsistencies. In short, Sora still struggles to understand the rules of the physical world.
Sora also hasn’t solved text rendering and often produces jumbled text, though it does well at style edits, scrolling-text animations, and news-anchor-style footage.
However, Sora shines in certain scenarios.
For instance, Sora excels in landscape shots, generating drone footage comparable to professional content, and performs reasonably well with cartoon and stop-motion animation styles.
Performance-wise, a 5-second 360p video typically generates within 20 seconds.
However, 1080p videos or complex prompts might take several minutes to generate, and with the current influx of users, generation speeds have noticeably slowed.
Many users tried Sora as soon as it was released. User @bennash reported that a video still hadn’t finished after 22 minutes of rendering, and the website temporarily stopped accepting new registrations.
Content creator @nickfloats noted that while Sora doesn’t preserve certain visual effects when converting images to video, the overall conversion quality is “clear and satisfactory.”
Can Sora Become OpenAI’s Next “Golden Goose”?
The Sora system card highlights several noteworthy details.
OpenAI believes Sora provides a foundation for models that understand and simulate the real world, marking a significant milestone toward achieving Artificial General Intelligence (AGI).
The official blog explains that Sora is a diffusion model that starts with a video resembling static noise and gradually removes noise to create the final video. By processing multiple frames simultaneously, the model solves a crucial challenge: maintaining consistency of objects even when they temporarily leave view.
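The start-from-noise, refine-step-by-step loop can be illustrated with a toy sketch. This is not Sora's algorithm: the "denoiser" here is a hypothetical stand-in that already knows the clean target, whereas a real diffusion model learns to predict what to remove. All names are illustrative. The one point it mirrors from the description above is that every frame is updated together in each step.

```python
import random


def mean_abs_error(video, target):
    """Average absolute difference between two same-shaped 'videos'."""
    diffs = [abs(x - t) for f, c in zip(video, target) for x, t in zip(f, c)]
    return sum(diffs) / len(diffs)


def toy_denoise(target, steps=50, seed=0):
    """Start from pure noise and refine all frames toward `target`.

    `target` is a list of frames, each a list of floats (a tiny 'video').
    The target stands in for what a trained model would predict.
    """
    rng = random.Random(seed)
    # Begin with Gaussian noise in the same shape as the target video.
    video = [[rng.gauss(0.0, 1.0) for _ in frame] for frame in target]
    for step in range(steps):
        remaining = steps - step
        # Remove an equal share of the remaining noise from every frame at once.
        video = [
            [x + (t - x) / remaining for x, t in zip(frame, clean)]
            for frame, clean in zip(video, target)
        ]
    return video


if __name__ == "__main__":
    # Three frames of four 'pixels' each; an object brightens over time.
    clean = [[0.0, 0.2, 0.2, 0.0], [0.0, 0.5, 0.5, 0.0], [0.0, 0.9, 0.9, 0.0]]
    result = toy_denoise(clean)
    print(mean_abs_error(result, clean))
```

Because the final step removes all remaining noise, the result matches the target up to floating-point rounding. In a real model the prediction is imperfect, which is where artifacts such as the motion glitches described earlier can come from.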
Like GPT models, Sora uses a Transformer architecture.
Sora uses the recaptioning technique from DALL·E 3, which generates highly descriptive captions for the visual training data, so the model follows users’ text instructions more faithfully in the generated video.
Beyond generating videos from text prompts, Sora can create videos from static images, accurately animating image content with attention to detail. The model can also extend or fill missing frames from existing videos.
To ensure safe deployment, OpenAI has built Sora’s safety measures on lessons from deploying DALL·E in ChatGPT and the API, along with safeguards from other OpenAI products. Among other things, its usage policy:
- Prohibits using others’ likenesses without permission and depicting real minors
- Prevents creating illegal content or infringing on intellectual property
- Bans harmful content, including non-consensual intimate images, bullying, harassment, defamation, or content intended to spread violence, hate, or cause distress
- Restricts creating and spreading content for fraud, scams, or deception
All Sora-generated videos include C2PA metadata, identifying them as Sora-generated for transparency and source verification.
Unlike Flux, which gained popularity through realistic human portraits, Sora applies strict review standards to content containing people: it is currently offered only as a pilot feature to select early testers, and content containing nudity is blocked.
When Sora was first previewed in February, it received widespread acclaim.
But while audiences back then could still marvel at demo videos and declare that “reality no longer exists,” today’s audiences, seasoned by a wave of video models both domestic and international, are harder to impress with similar products.
This shift in attitude stems from a simple fact.
As AI evolves from “barely usable” to “highly capable,” user expectations have also evolved from “can it be done” to “how well can it be done.”
Fortunately, the Sora team hasn’t rested on its laurels. Through close collaboration with artists, it has made significant workflow improvements, and features like Re-cut, Remix, and Storyboard are notably practical.
As long as clients and contractors exist, communication will remain essential to creative workflows, and AI’s role is to make that communication more efficient. Sora’s value lies less in what it can do than in freeing creators from technical details so they can focus on the creative work itself.
Meanwhile, the $200 ChatGPT Pro subscription that stirred controversy last week now has a more reasonable price anchor, since it bundles the most generous Sora access. This product synergy could unlock applications and commercial value well beyond current expectations.
For now, users’ willingness to pay is the most honest signal.
With Kling AI reporting monthly revenue in the tens of millions, the potential of this blue ocean is evident. For OpenAI, still in its “money-burning” phase, Sora is expected to become another golden goose after ChatGPT.
As Sora evolves from “usable” to “good” to “excellent,” we might one day discover that what’s truly limitless is not reality, but human creativity.