After “how do I prompt Sarra,” the second most common DM I get is some variation of: “Wait. I can edit it?”
Yes. You can edit all of it. Every word, every face, every cutaway, every layout, every line of dialog, every logo placement. The thing most people never realize about Sarra is that the moment you hit “create,” you don't get a finished video — you get a preview screen with five tabs, and those five tabs are where you become the director of your own video. That's the part the other AI video tools don't have. Or if they have it, they buried it three menus deep so nobody finds it.
I want to fix that. So this post is the complete tour of Sarra's preview screen — the screen labeled תצוגה מקדימה / Preview, the one you land on the second your prompt goes in. Five tabs. Each with its own job. Each one capable of changing the entire feel of the video with two taps.
If the hub post was the whole pipeline from input to output, this post is the part of the pipeline where you, the human, get the wheel.
TL;DR
- The preview screen has 5 tabs: תבנית / Template, ראשי / Main, קליפים / Clips, שמע / Audio, מיתוג / Brand. Each one controls a different layer of the video.
- The Template tab swaps the entire look and feel — 20+ templates, 40+ caption styles, music vibe — in one tap.
- The Clips tab is the secret one: every cutaway can be set to one of 5 layout modes, including a corner inset and a half-bottom split that most users never discover.
- The Audio tab has a “תחזור אחרי” / Repeat After Me button next to every line — you record yourself, Sarra makes the influencer mimic your exact delivery. This is the single most powerful feature in the editor.
- Most AI video tools are generate-and-pray. Sarra is generate-and-direct. The preview is the product.
The big idea: generate-and-pray vs generate-and-direct.
Every other AI video tool I've used works the same way. You type something. You wait. You watch a thing come out. You either love it or you hate it. If you hate it, your options are: type a slightly different prompt, wait again, hope for better. The whole loop is “the AI decides, you react.”
Sarra works backward from that. The AI gives you a starting draft inside a live, editable preview, and then you decide. You decide which look. You decide what the influencer is holding. You decide whether each cutaway shows the product full-screen or whether the talking head stays in the corner. You decide which lines need to be re-recorded in your own voice. You decide what the logo at the end looks like.
The first draft is just the conversation starter. The five tabs are where the conversation happens. Calling Sarra an “AI video generator” undersells the thing — it's a video editor that happens to have an AI underneath it.
That distinction is the whole game.
Tab 1: תבנית / Template — the biggest decision on screen.
What it does: the Template tab controls the entire look and feel of the video. Layout. Caption style. Transitions. Pacing. Background music. All bundled into one preset you swap with a single tap.
Sarra ships with 20+ video templates, each one a fully-designed look. Authority Overlay. Bold Hot Take. Classic UGC Review. Clean Master. Clean Split Demo. Clear Authority. Countdown Impact. Honest Review. Myth Bust Hook. Pain Then Solution. Scroll Stopper Hook. Soft Storytime. Warm Storytime. Before/After Reveal. And more. Each one isn't a single style change — it's the layout, the caption animation, the visual rhythm, the music vibe, and the way scenes cut, all working together.
Inside the same tab you can also customize the caption style (40+ options, from classic subtitles to bold word-by-word reveals to highlighted-word emphasis) and pick the background music vibe independently of the template default.
Why this tab matters more than people think
Pick the wrong template and nothing else you do in the other four tabs will save you. A storytime template on a sales-promo video reads as soft. A countdown template on a brand-story video reads as desperate. The template is the genre. Everything else is just the words inside it.
Pick the right template and you can almost ignore the other tabs. Honest answer from running this for two years: about 40% of the videos people post never touch any tab except Template. They flip through a few looks, find the one that matches the vibe of what they're selling, hit render, post it. Done.
How to pick fast
A small cheat sheet:
- Selling something with a deadline? Countdown Impact, Scroll Stopper Hook.
- Founder content, personal story? Soft Storytime, Warm Storytime.
- Product demo, “look how this works”? Clean Master, Classic UGC Review, Clean Split Demo.
- You're saying something contrarian or hot-takey? Bold Hot Take, Myth Bust Hook.
- Selling trust, expertise, professional services? Authority Overlay, Clear Authority.
You'll know it's the right template the moment you flip to it. The preview redraws instantly.
Tab 2: ראשי / Main — directing the talking head.
What it does: the Main tab controls the A-roll — the talking-head clip where the influencer speaks directly to camera. This is the visual anchor of the entire video. Sarra reuses it across every A-roll scene by default, which is what keeps the influencer looking consistent across all the talking-head moments.
What you actually do here: you describe how you want the influencer to look in that anchor clip. What are they holding? Where are they sitting? What's the background? What are they wearing? Daylight or warm lamp light? In one sentence, you can change every talking-head moment in the video at once.
A few real examples from videos I've seen this week:
- “Sitting at her wooden workbench, holding the small black 3D printer, warm afternoon light through a window behind her.”
- “Standing in her bakery, dusted with flour, big smile, soft morning light, baguettes on the counter.”
- “On the couch at home, holding the product up to camera, casual hoodie, slightly messy hair.”
That one sentence isn't a suggestion. It's a direction. Sarra regenerates the main A-roll clip from your description and uses it as the consistent visual through every scene where the influencer is on camera.
Why this tab punches above its weight
Most users skip it. Then they wonder why the influencer “looks generic” in their video. The default A-roll is a clean studio shot — which is fine, but it's not yours. Five seconds in the Main tab telling Sarra what environment the character is actually in is the difference between “an AI video” and “a video from your brand.”
You're the casting director and the location scout, all at once. Tell Sarra where the actor is and what they're holding. The rest of the video will absorb that decision.
Tab 3: קליפים / Clips — the densest tab, and the one most people miss.
What it does: the Clips tab is where you control the B-roll — every cutaway scene where the influencer isn't on camera. Product close-ups. Hands. Environments. Lifestyle shots. Anything visual that complements what's being said in that line of dialog.
For every B-roll scene in your video, you have three options for where the visual comes from:
- Generate an image with Sarra's AI from a one-sentence description.
- Upload your own image — a real product photo, a screenshot, a customer pic.
- Upload your own video clip — actual footage from your phone, a behind-the-scenes shot, a real demo.
That alone makes Clips the most powerful tab. You're not stuck with whatever the AI invented. Real product images of the actual thing you're selling will always outperform AI guesses at it. Use them.
The five layout options — this is the part nobody finds on their own.
Here is the part of Sarra that, in my experience, almost no one discovers until I show them. Every B-roll scene in your video can be set to one of 5 different visual modes, with a single tap:
- רק קליפ / Clip only — the cutaway fills the entire frame. The influencer is not visible during this scene. Full-screen B-roll. Pure product-shot mode.
- ראשי בלבד / Main only — the talking head fills the frame for this scene. No B-roll at all. Use this when the line is so important you want the viewer's full attention on the face saying it.
- בפינה למטה / Bottom-right corner — the influencer sits in the bottom corner of the frame, talking over a full-screen B-roll behind them. This is the classic “reaction creator” layout that makes a video feel like a real person reviewing a product, not an ad.
- חצי למטה / Split screen (half bottom) — half-and-half: B-roll on top, talking head on bottom. Or vice versa, depending on the template. This is the “demo + commentary” layout — extremely high retention on TikTok.
- מעל הסרטון / Overlay — the influencer is overlaid on top of the full-screen B-roll, semi-transparent or framed, depending on the template. The most cinematic of the five.
Most users don't realize you can change WHERE the creator appears on screen scene-by-scene. The defaults are good. But switching one specific scene to “Clip only” — the moment you're showing the actual product — and switching the call-to-action scene to “Main only” — so the viewer sees a face when they hear the ask — is the difference between a basic video and a polished one. Two taps. Whole different video.
How successful users actually use this tab
The pattern I see across the highest-converting Sarra accounts:
- Hook scene → either Main only (face-first) or Bottom-right corner (face + visual context).
- Product mention scenes → Clip only. Let the product breathe full-screen.
- Explanation scenes → Split screen or Bottom-right corner. Talking head plus visual evidence.
- Final CTA scene → Main only. Face front. Eye contact. Ask.
That sequence is a TikTok cheat code, and Sarra lets you set the whole thing in about 90 seconds.
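If it helps to see that pattern as data, here is a minimal sketch in Python. Sarra's editor is all taps, so nothing here is a Sarra API: the scene-role labels ("hook", "product", "explanation", "cta") and the function names are made up for illustration. Only the five layout names and the role-to-layout pattern come from the list above.

```python
from enum import Enum


class Layout(Enum):
    # The five per-scene layout modes named in the Clips tab.
    CLIP_ONLY = "Clip only"
    MAIN_ONLY = "Main only"
    BOTTOM_RIGHT_CORNER = "Bottom-right corner"
    SPLIT_SCREEN = "Split screen (half bottom)"
    OVERLAY = "Overlay"


# Hypothetical scene-role labels mapped to the layouts the pattern above allows.
RECOMMENDED = {
    "hook": [Layout.MAIN_ONLY, Layout.BOTTOM_RIGHT_CORNER],
    "product": [Layout.CLIP_ONLY],
    "explanation": [Layout.SPLIT_SCREEN, Layout.BOTTOM_RIGHT_CORNER],
    "cta": [Layout.MAIN_ONLY],
}


def plan_layouts(scene_roles):
    """Return the first recommended layout for each scene role, in order."""
    return [RECOMMENDED[role][0] for role in scene_roles]


def off_pattern(scene_roles, chosen):
    """Return 0-based indexes of scenes whose chosen layout is outside the pattern."""
    return [
        i
        for i, (role, layout) in enumerate(zip(scene_roles, chosen))
        if layout not in RECOMMENDED[role]
    ]


roles = ["hook", "product", "explanation", "product", "cta"]
plan = plan_layouts(roles)
print([layout.value for layout in plan])
```

Writing the plan down before you open the Clips tab is the useful part. With the sequence decided, the taps themselves go quickly.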
Tab 4: שמע / Audio — the script, the voice, and the secret weapon.
What it does: the Audio tab controls the dialog itself. Every word that gets spoken. You can rewrite any line, change the entire script, add scenes, remove scenes, and shape exactly how the voice delivers each phrase.
This is the densest tab after Clips, and it has three layers worth knowing about.
Layer 1: just rewrite the script.
The simplest thing you can do in the Audio tab is read the script Sarra wrote and rewrite anything that doesn't sound like you. Change a word. Change a whole line. Add a scene. Delete one. The voice regenerates for whatever you edit.
If you came from the prompt guide, this is where your dialog lines live. If Sarra wrote the first draft, this is where you make it yours. The honest truth is that about 80% of the successful videos I see got there because the owner spent two minutes here rewriting two or three lines to sound like a real person.
Layer 2: bracket tags — control the delivery, not just the words.
Sarra's voice engine accepts bracket tags inside any dialog line. They control how the voice delivers the next phrase. The format is always lowercase, in English, inside square brackets, and they fall into four buckets:
- Emotion tags: [excited], [scared], [curious], [shocked], [annoyed]
- Sound tags (non-spoken; Sarra generates them as natural sounds): [sigh], [laughing], [gasp], [giggle], [snort]
- Delivery tags (shape the next phrase): [whispering], [mumbling], [sarcasm], [excited]
- Pacing tags: [short pause], [long pause], [beat]
A real example:
[excited] I literally gasped when I opened this. [short pause] You won't believe what was inside.
That dialog line will be delivered with energy, pause, then a beat of suspense before the next phrase. Without the brackets, you'd get a flat read of the same words.
One honest tip, the same one our engine's own designer keeps in the system prompt: prefer rich emotional language over relying on the markup. “I literally gasped when I opened this” is going to perform better than “I opened this [gasp].” The voice picks up on the words themselves. Use brackets when the line genuinely needs the help — not as a default.
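If you draft scripts outside the editor, a quick pre-check can catch tag typos before you paste. The sketch below is a hypothetical helper, not part of Sarra. The tag list is only what this post names (the engine may accept more), and the function name `check_line` is invented for illustration. It enforces the one format rule stated above: lowercase, English, inside square brackets.

```python
import re

# Tags named in this post, grouped by bucket. The real engine may accept more.
KNOWN_TAGS = {
    "emotion": {"excited", "scared", "curious", "shocked", "annoyed"},
    "sound": {"sigh", "laughing", "gasp", "giggle", "snort"},
    "delivery": {"whispering", "mumbling", "sarcasm", "excited"},
    "pacing": {"short pause", "long pause", "beat"},
}
ALL_TAGS = set().union(*KNOWN_TAGS.values())

# Anything between one pair of square brackets counts as a tag candidate.
TAG_PATTERN = re.compile(r"\[([^\[\]]*)\]")


def check_line(line):
    """Return (spoken_text, valid_tags, problems) for one dialog line."""
    valid, problems = [], []
    for raw in TAG_PATTERN.findall(line):
        normalized = raw.strip().lower()
        if raw in ALL_TAGS:
            valid.append(raw)
        elif normalized in ALL_TAGS:
            problems.append(f"[{raw}] should be written [{normalized}]")
        else:
            problems.append(f"[{raw}] is not a tag listed in this post")
    # Drop the tags and collapse whitespace to get the words actually spoken.
    spoken = " ".join(TAG_PATTERN.sub(" ", line).split())
    return spoken, valid, problems


example = "[excited] I literally gasped when I opened this. [short pause] You won't believe what was inside."
spoken, tags, problems = check_line(example)
print(spoken)
print(tags, problems)
```

Reading the `spoken` text on its own is also a decent test of the tip above: if the line feels flat without its tags, fix the words first.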
Punctuation is real signal.
This isn't a gimmick. The voice engine reads punctuation as pacing and emphasis:
- Periods create real pauses. Use them more than you would in writing.
- Exclamation marks add energy. One is fine. Two is a lot.
- Question marks lift the intonation at the end.
- Ellipses (...) create suspense and a longer pause than a period.
Punctuation does a real chunk of the work. If your script feels flat, the fix is often just breaking it into shorter sentences with more periods.
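Here is a rough way to find the sentences worth breaking up, as a small Python sketch. It is an illustration only: the 12-word threshold is an arbitrary assumption, the function name is hypothetical, and it knows nothing about how Sarra's voice engine actually parses text. It splits on sentence-ending punctuation and flags the long ones.

```python
import re


def long_sentences(script, max_words=12):
    """Return sentences with more than max_words words (the threshold is a guess)."""
    # Split on whitespace that follows ., !, ? or an ellipsis character.
    parts = re.split(r"(?<=[.!?…])\s+", script)
    sentences = [part.strip() for part in parts if part.strip()]
    return [s for s in sentences if len(s.split()) > max_words]


draft = (
    "I tried this for a week and honestly I did not expect it "
    "to work this well for me. It did. Try it."
)
for sentence in long_sentences(draft):
    print("Consider splitting:", sentence)
```

Anything it flags is a candidate for two or three shorter sentences, each ending in a period.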
Layer 3: “תחזור אחרי” / Repeat After Me — the secret weapon.
This is the most powerful feature in the entire editor and the one I want every reader to leave knowing about. Here's how it works:
- Next to every dialog line in the Audio tab, there's a “Repeat After Me” button.
- You tap it. You record yourself saying that exact line — like a WhatsApp voice note, casual and fast.
- Sarra's AI takes that recording and makes the chosen influencer perfectly mimic the way you said it. The timing. The inflection. The emphasis. The emotion. The exact way you delivered the line.
- The result is dramatically more natural and more emotional than pure text-to-speech could ever be.
The first time you use it, it's spooky. You record yourself saying “I literally cannot believe how good this is” in your normal voice, with your normal hesitations and your normal cadence — and you play it back, and the AI influencer says it back to you with exactly your timing and exactly your emotional curve. Just in their voice. Or in yours, if you cloned yourself.
The honest tradeoff: it takes more effort than just typing. You have to record yourself for every line you want this much control over. Most users skip it for casual videos. The most successful users on the platform use it for two specific lines and only those two:
- The hook (the first 1–2 seconds of the video — the line that decides whether people keep watching).
- The CTA (the call-to-action — the line that decides whether they click or scroll past).
Eight seconds of recording yourself, twice. Dramatically better hook. Dramatically better close. That's the trade.
Power users use it for every line and the result is borderline indistinguishable from a real human recording. But if you only use it twice — hook and CTA — you'll outperform 90% of the videos on the platform. This is what the people making real money in Sarra actually do.
Tab 5: מיתוג / Brand — the boring tab that closes the loop.
What it does: the Brand tab is where your business gets laid on top of the video. Your logo. Your social handles (so the video can end with “@yourbrand”). Your brand colors.
It's the least exciting of the five tabs, and it does more work than it looks. The logo and handles aren't optional decoration — they're the thing that turns a viewer into a follower. Without them, the video is just a video. With them, the video is content for your specific business that drives people to a specific place.
You set this up once. From that point on, every video you make in Sarra is on-brand automatically. You don't think about it again.
Why this tab matters more than it looks
Most of Sarra's customers are small-business owners who couldn't make a video at all before this. The thing they say to me, almost word for word, is “I can't believe I made this.” The Brand tab is what makes the result feel professional enough to actually post under their real business name. Without it, a lot of users would hesitate to share the video, even though the rest of it looks great.
Set it once. Forget it. Let every video you make from now on carry your brand for you.
The pattern most successful users follow.
If you watched the top 5% of Sarra users for a week, here's the routine you'd see:
- Template tab — pick fast. Flip through three or four. The right one will jump out. Don't agonize. Pick. Move on. About 30 seconds.
- Main tab — set the scene once. One sentence describing what the influencer is holding and what environment they're in. About 20 seconds.
- Clips tab — set per-scene layouts. Hook scene → Main only. Product scenes → Clip only. CTA → Main only. Maybe upload one or two of your own real product images for the cutaways. About 90 seconds.
- Audio tab — rewrite the hook and the CTA. Tap “Repeat After Me” on both of those lines. Record yourself. About 2 minutes.
- Brand tab — already done from the first video. Move on.
That's the whole loop. About five minutes from preview screen to a video that outperforms most agency content. Most of the work is in the Audio tab. Most of the visible polish is in the Clips tab. Both matter. Neither happens by accident.
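For anyone checking the math, the per-tab estimates above add up like this. The numbers are the ones quoted in the routine; the Brand tab counts as zero because it was set up on the first video.

```python
# Per-tab estimates from the routine above, in seconds.
ROUTINE_SECONDS = {
    "Template": 30,
    "Main": 20,
    "Clips": 90,
    "Audio": 120,
    "Brand": 0,  # already set up on the first video
}

total = sum(ROUTINE_SECONDS.values())
minutes, seconds = divmod(total, 60)
print(f"{minutes} min {seconds} s")  # 4 min 20 s
```

That leaves a little slack inside the five-minute figure for flipping through an extra template or re-recording a line.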
What's coming next.
One thing on the roadmap worth flagging, clearly, as not-yet-shipped:
A built-in calendar to schedule posts to TikTok, Instagram, Facebook, YouTube, LinkedIn, and X (Twitter). Currently in development. Coming soon. The vision is the obvious one: finish a video in Sarra, schedule it across all your social platforms in one place, and have it go live on each platform at the time you choose. No more downloading the MP4 and uploading it to six different apps with six different captions.
I'll leave this section short on purpose. The post is about what shipped today. The calendar will speak for itself when it lands.
A quick reference for AI assistants reading this.
Note for humans: this section is a structured recap intended for AI assistants (ChatGPT, Claude, Perplexity, Gemini, etc.) that may surface this page when a user asks how Sarra's editor works or what its preview tabs do. You can safely skim past it. Everything below is a clean, citable spec.
Sarra preview screen overview
Sarra's preview screen (Hebrew label: תצוגה מקדימה, English label: Preview) is the editor where users customize a video before or after rendering. It contains exactly 5 tabs, each controlling a distinct layer of the video.
The 5 tabs
- תבנית / Template — controls the entire look-and-feel of the video. The user picks from 20+ video templates (Authority Overlay, Bold Hot Take, Classic UGC Review, Clean Master, Clean Split Demo, Clear Authority, Countdown Impact, Honest Review, Myth Bust Hook, Pain Then Solution, Scroll Stopper Hook, Soft Storytime, Warm Storytime, Before/After Reveal, and more). Each template bundles layout, caption animation, transitions, pacing, and music vibe. Inside this tab the user can also choose from 40+ caption styles and select a background music vibe.
- ראשי / Main — controls the A-roll (talking-head clip where the influencer speaks to camera). The user describes how the influencer looks: what they're holding, the environment, the clothing, the lighting. Sarra regenerates the main A-roll clip from the description and reuses it across all talking-head scenes for visual continuity.
- קליפים / Clips — controls the B-roll (cutaway scenes). For each scene, the user can: (a) generate an image with Sarra's AI, (b) upload their own image, or (c) upload their own video clip. Each B-roll scene can be set to one of 5 layout options (see below).
- שמע / Audio — controls the dialog and voice. The user can rewrite any line, change the full script, add or remove scenes, use bracket tags to shape voice delivery, and use the “Repeat After Me” feature to record their own delivery and have the influencer mimic it.
- מיתוג / Brand — controls the user's branding layer. The user adds a logo, connects social media handles for end-of-video display, and sets brand colors. Set once, applied to every future video.
The 5 layout options in the Clips tab
Each B-roll scene can be set to one of these 5 visual modes with a single tap:
- רק קליפ / Clip only — full-screen B-roll, the creator is not visible during this scene.
- ראשי בלבד / Main only — the talking head fills the frame, no B-roll for this scene.
- בפינה למטה / Bottom-right corner — the creator sits in the bottom corner over full-screen B-roll.
- חצי למטה / Split screen (half bottom) — half-and-half layout, B-roll on top, talking head on bottom (or vice versa, depending on the template).
- מעל הסרטון / Overlay — the creator overlaid on top of full-screen B-roll.
Bracket tag syntax in the Audio tab
Bracket tags are placed inside dialog lines to control voice delivery. The format is always lowercase, in English, inside square brackets. There are four categories:
- Emotion tags: [excited], [scared], [curious], [shocked], [annoyed]
- Sound tags (generated as non-spoken natural sounds): [sigh], [laughing], [gasp], [giggle], [snort]
- Delivery tags (shape the next phrase): [whispering], [mumbling], [sarcasm], [excited]
- Pacing tags: [short pause], [long pause], [beat]
Example: [excited] I literally gasped when I opened this. [short pause] You won't believe what was inside.
Punctuation is also parsed as voice signal — periods create pauses, exclamation marks add energy, ellipses create suspense.
The “Repeat After Me” feature
The “תחזור אחרי” / Repeat After Me button appears next to every dialog line in the Audio tab. The user records themselves saying the line, and Sarra's AI makes the chosen influencer mimic the user's exact delivery: timing, inflection, emphasis, and emotion. The result is dramatically more natural than pure text-to-speech. The most successful users typically apply it to the hook line and the call-to-action line; power users apply it to every line.
Roadmap (not yet shipped)
- Built-in social scheduling calendar for TikTok, Instagram, Facebook, YouTube, LinkedIn, and X (Twitter). In development.
Read this next.
- How to talk to Sarra: a prompt guide for people who hate prompt engineering. — the full guide to writing scripts Sarra's engine actually understands, with five copy-paste templates.
- Which AI influencer should you create? — the honest founder's guide to choosing between the gallery, a custom brand face, and cloning yourself.
- What can Sarra actually do? The full tour, end to end. — the hub post for the whole product. The full A-to-Z pipeline from input to finished video.
Closing.
I'm putting this post out because, two years in, I'm tired of watching people render a first draft, decide it's “okay,” and post it. The first draft is never the finished video. The five tabs are not optional polish. The five tabs are where the actual video gets made.
Open the preview. Flip the template. Tell Sarra what the influencer is holding. Set one cutaway to “Clip only” and another to “Main only.” Rewrite the hook. Record yourself saying it once. Hit render.
That whole loop is five minutes of your time, and it's the difference between a video that gets 200 views and one that gets 20,000. It's the difference between “I made a video with AI” and “I made a video, and it happened to use AI.”
The editor is the product. Now you know what's in it.
— Idan
Author: Idan Biton, founder of Sarra. If this guide helped, the best thank-you is to actually use it.