After a year’s wait, Sora finally arrived – but OpenAI made users work for it.
Despite promises of open access, only the lucky few with quick reflexes got in early. After a day of constant refreshing and countless “Please check back later” messages, my registration finally went through.
However, the same $20 that lets me discuss poetry and philosophy with ChatGPT buys only 40 five-second 480p Sora videos per month. Each generation feels like walking on thin ice.
Now that we have access, we naturally want to make every Sora video count by comparing it with Runway and Capcut AI. The verdict? It’s usable and fun, but hard to give unreserved praise.
Sora, Runway, and Capcut AI: Each Fails in Its Own Amusing Way
Comparing Sora with Runway and Capcut AI makes sense – one is an established pioneer in AI video overseas, while the other is a rising star from China that has won over users worldwide with its capabilities.
The rules are simple: use the same English prompts (translated here for readability) to generate 5-second videos. The only difference is that Sora’s resolution is set to 480p. We have to be economical with those credits.
Let’s start with text-to-video, comparing realism and texture quality by seeing how these AI tools generate cats.
Despite being only 480p, Sora’s output looks HD with beautiful color grading.
▲Sora generation, prompt: A British Shorthair cat on a balcony captured with a 200mm telephoto lens, showing clear fur detail. Large potted plant in foreground with slight bokeh, tree leaves swaying in background. Scene has film grain texture and color saturation, HD quality
Runway and Capcut AI perform similarly, with Capcut AI having the most accurate foreground and background generation. Three cats, three different coat patterns.
▲Runway generation
▲Capcut AI generation
Next, let’s have the AI videos “write” and see if they can produce “APPSO”.
Sora’s hand movements look natural, but the written lines seem to have a mind of their own.
▲Sora generation, prompt: Overhead view of a hand writing “APPSO” on white sketch paper, black strokes, fluid writing motion, natural hand movement, soft lighting, close-up shot
Runway comes closest but still isn’t perfect: apart from the final stroke, the letter paths don’t sync with the hand’s movements.
▲Runway generation
As for Capcut AI, it produces gibberish, but impressively, the letter strokes follow the hand movements.
▲Capcut AI generation
Let’s test movement fluidity with a mountain bike race. Sora’s camera work and motion trajectories fully follow the prompt, with realistic shadows.
▲Sora generation, prompt: Mountain biker speeds through consecutive dirt track jumps, becomes airborne off the final ramp, side angle captures the peak moment
Runway only gets half the prompt right – the cyclist doesn’t appear at the start but delivers a highlight shot at the end.
▲Runway generation
Capcut AI is the opposite of Runway – strong start but problematic ending. Why did an extra person appear?
▲Capcut AI generation
Time to increase difficulty with a more complex prompt involving camera transitions.
Sora’s scene has saturated colors, as if already color-graded, but the male character appears out of nowhere, and the AI ignores the instruction to pan to him.
▲Sora generation, prompt: On a sunny afternoon in a Starbucks-style café, camera first focuses on a smiling young Chinese woman, then pans to a young Chinese man nodding as he speaks. They sit across from each other with two coffee cups on the wooden table. Natural light fills the space, creating a warm atmosphere
Runway shoots directly from the side, capturing both characters’ expressions but skipping the camera movement, and the man’s hands have issues.
▲Runway generation
Capcut AI is similar to Runway but slightly better, looking more authentically Chinese. However, the two people at the same table never look at each other.
▲Capcut AI generation
Besides text-to-video, image-to-video is also a major feature, and it’s more practical. Many commercial AI videos are primarily image-to-video, aiming for consistency at the image stage first.
However, $20 Plus users can’t upload photos or videos containing people to Sora. As an alternative, we uploaded a wizard cat meme, asking the cat to wave its magic wand and create a rose.
For some reason, Sora’s image-to-video doesn’t work – the cat barely moves, only the logo in the bottom right shows it’s not a still image.
▲Sora generation, prompt: The cat waves its magic wand and creates a red rose
Runway has the cat wave the wand with its right paw and create a rose with its left, following the prompt, but the flower seems to be on a different layer.
▲Runway generation
Capcut AI’s performance is perfect, with the most natural effect – it could be a new meme gif.
▲Capcut AI generation
After trying animals, let’s test scenes without people. I used an AI-generated industrial-wasteland image as the source material.
Sora’s result is hard to evaluate: the angle is low enough, but the camera doesn’t track from the side, and the scene transition is abrupt. It feels like I’m under the vehicle rather than inside it.
▲Sora generation, prompt: Armored vehicle passes by, tires kick up dust and debris, side tracking shot, low angle perspective, slow motion, cinematic quality
Runway’s generation feels most authentic, even animating the car windows.
▲Runway generation
Capcut AI pulls the camera way back, barely following the prompt.
▲Capcut AI generation
None of the three AIs aced these tests. Of course, these are individual cases rather than a representative benchmark – they offer just one perspective for evaluation.
Regarding Sora specifically, it performs well in text-to-video realism with a cinematic quality, and its adherence to movement prompts is decent, sometimes better than Capcut AI and Runway.
But image-to-video often misses the mark, either remaining static or ignoring camera movement instructions, making the overall value proposition questionable.
▲Sora generation, prompt: A 35mm film short shot in Shanghai in the 1990s, cinematic quality
“Basic Version” Model, Innovative Product
Sora’s middling performance may be because we’re using a “basic version” – unlike the artists OpenAI invited, we get the turbo model, which requires less computing power and consequently delivers weaker results.
Where the model falls short, the product makes up for it – announced in February and released in December, despite many competitors emerging, Sora still has features they don’t.
Unlike ChatGPT’s one-size-fits-all chat interface, Sora shows creativity in its interface design and product features.
Sora’s storyboard function, similar to keyframes but more flexible, allows us to add multiple cards on a timeline. Cards can contain prompts, images, or videos, and Sora generates complete videos between the cards.
So I wrote two prompts:
1. J-drama style shot, high school girl leaning against rooftop railing, profile composition, soft afternoon light on her face
2. She turns to face the camera with a smile, warm lighting emphasizing her expression
The generated result matches my imagination, with hair movement that’s irresistibly charming.
▲Sora generation
AI can’t yet make everyone a director, but Sora lets you experience storyboard design. However, the model’s current state means results are very random, and Sora’s credits don’t allow for much experimentation.
I wanted AI to mimic game CG effects with the protagonist quickly turning and drawing a gun, but ended up with a robot with a blank expression.
▲Sora generation
You can also just put one image on the storyboard, and Sora will automatically generate prompts suggesting how to animate your image.
This finally got the wizard cat moving. Apparently, this is how image-to-video limitations are meant to be overcome. However, the results can still be awkward, sometimes generating unnecessary elements.
▲Sora generation
Additionally, Sora’s Remix feature is fun, allowing video editing through natural language, changing video elements for “derivative creations.”
You can use your own videos or borrow from Sora’s community.
▲Image from: Sora community @bpyser1
For example, we can change the dancing paper figure to a boy band and switch the scene to a practice room.
The paper figure’s movements and clothing are roughly preserved, but the character’s limbs still don’t hold up to close inspection.
▲Sora generation
Even more interesting, we can use the Blend feature to merge two videos, with Sora automatically handling the transitions.
I thought we might get a smooth MV segment, since the two videos were so similar, but AI still surprised me – while the beginning and end are normal, the middle becomes chaotic. How many people are there anyway?
▲Sora generation
In conclusion, if you aren’t fixated on output quality, Sora is fun: its product features are genuinely inventive and comprehensive, sketching out a new creative workflow.
Looking at the current state, however, generation quality leaves significant room for improvement, and users get few attempts to work with – $20 only lets you scratch the surface. Sometimes the visuals are beautiful but poor motion handling ruins them; “reality has ceased to exist” remains a distant dream.
Please enjoy this cat passing through a wall – apparently in AI’s eyes, cats really are liquid.
▲Sora generation, prompt: Cinematic close-up of a black cat, elegantly leaping in front of red palace walls in the Forbidden City, shown in slow motion with the entire cat clearly visible, background blurred with shallow depth of field, golden eyes looking directly at camera at the peak of the jump. Using soft natural lighting, traditional Chinese architectural wall details form a blurred background
Sora’s issues are common among many AI video products – there’s no truly reliable one-shot solution. Simulating the real world? Achieving smooth motion? Maintaining character consistency? It’s possible, but probability-based, requiring multiple attempts and post-processing.
The generation quality is just what’s visible on the surface; underneath, AI video tools are collectively changing how content gets made. The future looks promising – but Sora, please upgrade your model first.