Founder’s notebook

    8-second AI videos are a lie. Here's the math.

    Every Sora / Veo / Runway launch screams 8-second demos. I'll show you why that format is useless for small businesses — and what actually wins.

    Idan Biton
    Founder, Sarra · March 28, 2026 · 6 min read

    Open Twitter any given Tuesday. There's a new AI video model launch. The hero demo is a cinematic 8-second clip of a dog skateboarding down a futuristic street. It has 40,000 likes. The founder replies to every comment.

    Then Thursday rolls around and a small-business owner — let's call her Pnina — signs up. She generates her first 8-second clip. It's, honestly, fine. She posts it to TikTok. It gets 37 views.

    Every AI video launch sells you the dream of the 8-second clip. Every actual algorithm on the internet punishes you for it. This is the gap nobody wants to talk about, so let's talk about it.

    What the TikTok algorithm actually measures

    TikTok's ranking system has three dominant signals (publicly discussed by multiple TikTok PMs and documented in leaked internal docs):

    1. Completion rate. Did the user watch to the end? If your video is 8 seconds long, you need them to watch all 8 to count as a “completion.”
    2. Watch time. Total seconds the user spent watching. An 8-second completion is 8 seconds of watch time. A 60-second completion is 60 seconds — 7.5× more.
    3. Re-watch rate. If someone loops the video, that multiplies watch time. Short loops help, but only if the total watch time is high.

    You already see the problem. On total watch time, a 60-second video with 50% retention (30 seconds watched) out-performs an 8-second video with 100% retention (8 seconds watched) by almost 4×, even though the short clip wins on completion rate.

    The math, more explicitly:

    • 8-second video, 100% retention:
      Watch time = 8 sec  |  Completion rate = 100%  |  Algorithm score (normalized) = 8
    • 60-second video, 50% retention:
      Watch time = 30 sec  |  Completion rate = 50%  |  Algorithm score = ~30
    • 60-second video, 70% retention (realistic for good hooks):
      Watch time = 42 sec  |  Completion rate = 70%  |  Algorithm score = ~42

    At 70% retention on a 60-second video, you're delivering 42 ÷ 8 = 5.25× the algorithm value of a perfect 8-second clip. And nobody hits 100% retention anyway: the watch-until-last-frame rate on TikTok is generally 15–35% even for top-performing creators.
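
    If you want to plug in your own numbers, the comparison above can be sketched in a few lines of Python. This is a back-of-the-envelope model, not TikTok's real ranking formula: it assumes, as the list above does, that the score is simply watch time (clip length × average retention). The names Clip, watch_time, and relative_value are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """A hypothetical clip described by its length and average retention."""
    length_sec: float
    retention: float  # average fraction of the clip watched, 0.0 to 1.0

    def watch_time(self) -> float:
        # Assumption: the "algorithm score" is just seconds watched per view.
        return self.length_sec * self.retention


def relative_value(clip: Clip, baseline: Clip) -> float:
    """How many times more watch time `clip` earns than `baseline`."""
    return clip.watch_time() / baseline.watch_time()


# The three scenarios from the list above.
perfect_short = Clip(length_sec=8, retention=1.0)
half_long = Clip(length_sec=60, retention=0.5)
good_long = Clip(length_sec=60, retention=0.7)

if __name__ == "__main__":
    for label, clip in [
        ("8 sec @ 100%", perfect_short),
        ("60 sec @ 50%", half_long),
        ("60 sec @ 70%", good_long),
    ]:
        ratio = relative_value(clip, perfect_short)
        print(f"{label}: {clip.watch_time():.1f} sec watched, {ratio:.2f}x the 8-second clip")
```

    Swap in the retention you actually see in your analytics. Under this simple model, a 60-second video only needs to hold more than about 13% average retention (8 ÷ 60) to beat a perfectly watched 8-second clip on watch time.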

    But aren't AI video models capped at 8 seconds for a reason?

    Yes — and the reason is compute cost, not user value. A cinematic text-to-video model like Sora or Veo burns huge GPU time per generated second. The 8-second cap is an economic ceiling, not a product feature.

    For short-form social video, you don't need a text-to-video model. You need:

    • An avatar (recorded once, animated forever).
    • A voice (recorded once, spoken forever).
    • A script tuned for retention.
    • Cuts, captions, B-roll overlays.

    Every one of these is cheap to generate. Stitched together, they give you 60 seconds of publishable vertical video for roughly the same cost as 8 seconds of Sora output.

    That's the architectural bet Sarra made: avatar-plus-voice at 60 seconds beats text-to-video at 8 seconds for everyone except cinematic demo-makers. And the 60-second path happens to be exactly what TikTok, Instagram Reels, and YouTube Shorts want.

    The hook-proof-offer-CTA arc needs 60 seconds

    Every high-performing short-form video I've studied (and I've studied a lot — Sarra's script model was trained on ~50,000 of them) follows the same four-beat structure:

    1. Hook (0–3 sec). Pattern interrupt. Question. Claim. Shocking visual.
    2. Proof (3–20 sec). Why should I trust this? Demo, review, before/after.
    3. Offer (20–45 sec). What's in it for me? Price, benefit, specific outcome.
    4. CTA (45–60 sec). What do I do next? Link, comment, follow.

    You cannot fit this in 8 seconds. You can fit a hook. Maybe a hook and a hint of proof. The offer and CTA live nowhere — which is why 8-second videos convert to zero clicks, zero follows, zero purchases.
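
    To make that concrete, here is a small Python sketch that lays the four beats on a timeline and checks how much of each one fits inside a clip of a given length. The timings are the ones listed above; the Beat type and the coverage function are hypothetical names for this illustration, not Sarra's actual script engine.

```python
from typing import Dict, NamedTuple


class Beat(NamedTuple):
    """One beat of the arc, with its start and end on the timeline."""
    name: str
    start_sec: int
    end_sec: int


# Timings taken from the four-beat structure above.
ARC = [
    Beat("hook", 0, 3),
    Beat("proof", 3, 20),
    Beat("offer", 20, 45),
    Beat("cta", 45, 60),
]


def coverage(clip_length_sec: float) -> Dict[str, float]:
    """Return the fraction of each beat that fits in a clip of this length."""
    result = {}
    for beat in ARC:
        # Seconds of this beat that play before the clip ends (never negative).
        visible = max(0, min(clip_length_sec, beat.end_sec) - beat.start_sec)
        result[beat.name] = visible / (beat.end_sec - beat.start_sec)
    return result


if __name__ == "__main__":
    for length in (8, 60):
        print(f"{length}-second clip:")
        for name, fraction in coverage(length).items():
            print(f"  {name}: {fraction:.0%} of the beat fits")
```

    For an 8-second clip this reports the full hook, under a third of the proof, and none of the offer or CTA. A 60-second clip covers all four beats.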

    8-second AI videos are good for launch posts. 60-second AI videos are good for small businesses. These are not the same job.

    What we did instead at Sarra

    Sarra ships 60-second vertical videos from day one. Every script our engine writes follows the hook-proof-offer-CTA arc. Every generation includes captions tuned for silent watching (90%+ of TikTok views are muted), auto-retention cuts, and a CTA frame.

    The output is less cinematically impressive than a Sora demo. It is dramatically more useful for actually growing a small business.

    If you're a small-business owner about to sign up for an AI video tool based on its 8-second demo: ask yourself what the algorithm you're posting to actually rewards. Then pick the tool that aligns.

    — Idan
