Ai-video-generator-claude seedance-testimonial-story
Generate customer testimonial and social proof video prompts for Seedance 2.0 on Higgsfield. Use for customer stories, case study videos, review showcases, social proof content, success story videos, or any content featuring customer results and experiences. Triggers on testimonial, review, case study, social proof, customer story, success story, results, feedback, client.
git clone https://github.com/rediumvex/ai-video-generator-claude
T=$(mktemp -d) && git clone --depth=1 https://github.com/rediumvex/ai-video-generator-claude "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/08-testimonial-story" ~/.claude/skills/rediumvex-ai-video-generator-claude-seedance-testimonial-story && rm -rf "$T"
skills/08-testimonial-story/SKILL.md
Testimonial Story: Customer Social Proof Video Prompts
Seedance 2.0 on Higgsfield — Input Specs
- Images: Up to 9 (customer photo, product screenshots, result charts, brand assets, star ratings)
- Videos: Up to 3 (customer-recorded clips, screen recordings, existing brand video)
- Audio: Up to 3 (customer voiceover clip, background music, ambient sound)
- Output: 4-15 seconds, 720p with synchronized audio
- Reference Syntax: @material[name] for uploaded assets
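As a rough pre-flight check, the limits listed above can be encoded in a few lines of Python. This is a minimal sketch and not part of Seedance or Higgsfield: the function name check_prompt is hypothetical, and it assumes assets are named image1, video1, audio1 and so on, as in the example prompts in this skill.

```python
import re

# Limits taken from the input specs above; all helper names are hypothetical.
LIMITS = {"image": 9, "video": 3, "audio": 3}
MIN_SECONDS, MAX_SECONDS = 4, 15

# Matches references such as @material[image1] or @material[audio2].
# The "<kind><number>" naming is an assumption based on this skill's examples.
REF_PATTERN = re.compile(r"@material\[(image|video|audio)(\d+)\]")


def check_prompt(prompt: str, duration_s: float) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if not MIN_SECONDS <= duration_s <= MAX_SECONDS:
        problems.append(
            f"duration {duration_s}s is outside {MIN_SECONDS}-{MAX_SECONDS}s"
        )
    # Collect the distinct asset numbers referenced for each kind.
    used = {kind: set() for kind in LIMITS}
    for kind, number in REF_PATTERN.findall(prompt):
        used[kind].add(int(number))
    for kind, numbers in used.items():
        if numbers and max(numbers) > LIMITS[kind]:
            problems.append(
                f"{kind}{max(numbers)} exceeds the limit of {LIMITS[kind]} {kind} assets"
            )
    return problems


print(check_prompt("Face from @material[image1], voice from @material[audio1].", 10))
# prints []
```

The check only looks at reference numbers and clip length. It does not verify that the referenced files were actually uploaded.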
Why Traditional Testimonials Fail
Traditional testimonials fail because they're static talking heads in flat lighting. Cinematic social proof makes viewers feel the transformation. Show the result first, the emotion second, the person third.
The 2-Second Hook for Testimonials
The first two seconds determine whether the viewer stays. Every testimonial video needs an immediate proof signal.
Hook Patterns
| Hook | Mechanism | When to Use |
|---|---|---|
| The Result First | Open on the metric, number, or outcome — then reveal the person behind it | Strong results with specific numbers |
| The Emotion Capture | Open on a genuine reaction moment — a smile, a gasp, a laugh — before context | High-emotion stories with visual customer footage |
| The Number Impact | Large, bold statistic fills the frame, animated to count up or slam in | Quantitative case studies with clear ROI |
| The Contrast Story | Split-second flash of "before" state immediately followed by "after" | Transformation stories with clear before/after |
Hook Templates
THE RESULT FIRST: "Single bold statistic center-frame. Large white typography on dark background. Number counts up rapidly — 0 to 347% — in 1.2 seconds. Each digit change accompanied by soft click sound. Camera slight push-in during count. At 1.5s: number resolves. Holds for 0.3s. Then: cut to or dissolve into customer's face — warmly lit, looking slightly off-camera, beginning to speak. Viewer is already primed: this person got that result."
THE EMOTION CAPTURE: "Open on tight close-up of customer's face at moment of genuine positive reaction — mid-laugh, eyes lit, natural expression. Warm key light from slight left angle. Shallow depth of field, background is soft-bokeh environment related to their work or home. No speech yet — just the face. Hold 0.8s. Then pull back slightly to include context. Sound: ambient environment fills in, then music enters soft underneath. The emotion happened before the words. The words explain what we already felt."
THE NUMBER IMPACT: "Black frame. Single word appears — the relevant metric category (REVENUE / HOURS SAVED / CLIENTS ADDED) — in thin caps, gray. 0.5s pause. Then: the number SLAMS in from below, large, white, kinetic. Overshoot bounce physics on landing. Camera shakes 2px on impact. Sound: deep bass thud synced to number landing. Reverb tail. Number holds 0.8s. Then secondary text fades in below: customer name, company, timeframe. Credibility layered after impact."
THE CONTRAST STORY: "Split-screen. Left side: grainy, slightly desaturated footage or image representing the 'before' state — the problem, the struggle, the mess. Right side: clean, warm, vibrant footage representing 'after.' Both sides hold for 1.5s. Then: left side collapses toward center, right side expands to fill full frame. Sound: before side has flat, muted audio. Transition whoosh. After side has rich, warm audio treatment. The visual argument is already made before a word is spoken."
Visual Formats for Testimonial Videos
The Documentary Interview
The gold standard for testimonial credibility. Feels earned, not staged.
DOCUMENTARY INTERVIEW TEMPLATE: "Customer seated slightly off-center — rule-of-thirds composition, looking toward empty space in frame (not directly at camera). Interview framing: head and shoulders visible, some environment in background. Background: deliberately chosen to reinforce context — workspace for B2B testimonial, home environment for consumer product, outdoors for lifestyle brand. Background should be in soft bokeh, identifiable but not distracting. Camera: slightly below eye level (more authority) or eye level (more intimacy). Static or very slow drift — no handheld jitter. Lighting: warm interview setup. Key light 45 degrees from subject's face, soft modifier. Fill light on shadow side, 1.5 stops under. Optional: practical lamp or window in background for depth. On-screen text: customer name and result appear as lower-third approximately 2 seconds into clip. Clean sans-serif typography. Small logo mark if applicable. Sound: customer voice is primary. Light room tone/ambient underneath. Optional subtle music bed — below the voice, supporting emotional register. Voice should be crystal clear. No reverb. No noise."
The Data Visualization
For B2B and SaaS social proof. Numbers made visceral.
DATA VISUALIZATION TEMPLATE: "Dark background, premium data environment. Chart or graph central to frame. Data is not static — it builds, grows, animates to tell the story. Opening: flat baseline (before state). Chart slowly builds the historical context — flat growth, struggle plateau. Muted colors, minor key ambient. At transition point: color shift. Chart line diverges upward sharply. Color warms, saturates. Growth curve accelerates. Sound: ascending tone, subtle but audible. At destination: the current result. Number appears adjacent to endpoint of the line or bar. Large, clear, warm-colored. Camera slowly pushes in toward the data throughout. Customer photo or logo appears in lower-third as attribution for the result. Optional: voice-over of customer describing the moment of change — synced to the transition inflection point in the chart. Typography: minimal, clear. Result numbers are 2x the size of labels. Timeframes labeled on axis. One visual message only — do not clutter."
The Split Journey
Before and after as parallel narratives. Emotional and visual contrast.
SPLIT JOURNEY TEMPLATE: "Frame divided horizontally. Top half: 'before' world. Bottom half: 'after' world. Both sides run simultaneously, showing parallel actions or states. Top (before): slightly desaturated. Environment is cluttered or stressful. Person's movement is hurried, tense. Color grade: cool, 6000K+. Actions: multiple steps, visible friction. Camera: slight handheld to convey instability. Bottom (after): full color saturation. Environment is clean, ordered. Person's movement is relaxed, deliberate. Color grade: warm, 3200K. Actions: single step, effortless. Camera: locked off, stable. Dividing line: can be a hard edge, a gradient, or a thin branded line element. Option: the line moves through the frame like a reveal, with the 'after' state expanding. At resolution (3/4 through video): bottom half expands to fill full frame. Top half disappears. The 'after' world takes over. Sound: background noise of 'before' fades out. Warm audio environment of 'after' fills in. Customer quote appears as text overlay."
The Montage Proof
Multiple customers or multiple results in rapid sequence. Creates overwhelming evidence.
MONTAGE PROOF TEMPLATE: "Rapid sequence of social proof elements. 3-7 shots, 1-2 seconds each. Each shot: different customer, same transformation narrative. Shot structure per customer: face OR result metric. Alternate — face, metric, face, metric. Creates visual rhythm. Face shots: consistent lighting treatment across all — signals production intentionality, makes diverse customers feel curated. Quick zoom-in or push-in during each shot for energy. Metric shots: bold typography, one number per frame, consistent type style. Numbers are large — primary focal point. Transitions: match-cut on similar compositions. Avoid random cuts. Face-to-face: cut on eye-line. Metric-to-metric: cut on number position. Sound: music track with clear BPM structure. Each cut lands on a beat. Optional: brief audio snippet from each customer (0.5s) that blends into overall mix — creates texture without full sentences. End frame: compilation stat — 'Join 2,400+ customers.' or summary metric. Warm, wide shot. Music resolves."
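Landing each cut on a beat is simple arithmetic: one beat lasts 60 / BPM seconds. The sketch below assumes a constant tempo and equal-length shots, and enforces the 3-7 shot and 1-2 second ranges from the template. The name beat_aligned_cuts and its parameters are hypothetical, not part of any tool.

```python
def beat_aligned_cuts(bpm: float, shot_count: int, beats_per_shot: int = 2) -> list[float]:
    """Return cut times in seconds, starting at 0, with every cut on a beat."""
    if not 3 <= shot_count <= 7:
        raise ValueError("the template calls for 3-7 shots")
    shot_s = beats_per_shot * 60.0 / bpm  # one beat lasts 60 / bpm seconds
    if not 1.0 <= shot_s <= 2.0:
        raise ValueError(f"shot length {shot_s:.2f}s is outside the 1-2s range")
    # shot_count shots need shot_count + 1 cut points (including start and end).
    return [round(i * shot_s, 3) for i in range(shot_count + 1)]


print(beat_aligned_cuts(bpm=120, shot_count=5))
# prints [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
```

At 120 BPM, two beats per shot gives one-second shots. A slower track needs fewer beats per shot to stay inside the 1-2 second window.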
Typography and Text Overlay Techniques
Displaying Quotes
QUOTE OVERLAY — MINIMAL: "Quote text appears as large white serif or sans-serif. Weight: bold or black. Centered or left-aligned. Max 15 words per screen — cut longer quotes. Text enters: fade-in over 0.3s, or slides up from below. Holds for reading time (calculated: word count ÷ 3 = seconds needed). Exits: fade or dissolve on transition. Attribution line (name, title, company) appears below in smaller, lighter weight, 1 second after main quote settles."
QUOTE OVERLAY — DRAMATIC: "One word or phrase at a time. Each key word appears large, centered. Camera or background reacts to each word — subtle push or pull. Word duration: 0.5-0.8s each. Final phrase: holds longer, larger size. Sound: soft click or tone on each word appearance. Creates reading rhythm. Color: white on dark, or brand color on neutral. Never reverse — dark on light reads poorly on video."
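The reading-time rule from the minimal overlay (word count ÷ 3 = seconds, at most 15 words per screen) can be sketched as a small helper. The function name and the returned dictionary keys are hypothetical. Words are split on whitespace, which is an assumption about how to count them.

```python
def quote_screen_plan(quote: str, words_per_second: float = 3.0, max_words: int = 15) -> dict:
    """Estimate on-screen hold time for a quote and flag quotes that need cutting."""
    word_count = len(quote.split())  # whitespace-separated tokens count as words
    return {
        "word_count": word_count,
        "hold_seconds": round(word_count / words_per_second, 2),
        "too_long": word_count > max_words,
    }


print(quote_screen_plan("We went from $12K to $31K MRR in 11 weeks"))
# prints {'word_count': 10, 'hold_seconds': 3.33, 'too_long': False}
```

A full 15-word screen needs 5 seconds of hold time, which is a large share of a 4-15 second clip, so shorter quotes leave more room for the other beats.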
Star Ratings
STAR RATING ANIMATION: "Five stars arranged horizontally. Stars appear left to right, 0.15s stagger. Each star: small scale-from-zero animation with slight overshoot. Color: gold/amber (#F5A623 or similar). Optional: particle burst on fifth star landing. Sound: ascending five-note arpeggio, one note per star. Final note is higher, brighter. Stars hold for 1s then either remain as overlay or transition out. Platform attribution (G2, Trustpilot, App Store) appears below in small text."
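To place the star entrances on a timeline, the 0.15s stagger can be turned into start times. This is a minimal sketch. star_start_times is a hypothetical helper, and start_s (where the animation begins in the clip) is an assumed parameter.

```python
def star_start_times(count: int = 5, stagger_s: float = 0.15, start_s: float = 0.0) -> list[float]:
    """Return the time in seconds at which each star begins its scale-in."""
    return [round(start_s + i * stagger_s, 3) for i in range(count)]


print(star_start_times())
# prints [0.0, 0.15, 0.3, 0.45, 0.6]
```

With five stars the last one starts 0.6s after the first, so the 1s hold described in the template begins shortly after that point, once the final scale-in has settled.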
Metrics and Numbers
COUNT-UP NUMBER: "Number begins at zero, counts up to target value. Speed: slow start, accelerates, then decelerates into final number (ease-in-out curve). Duration: 1.5-2.5s for numbers under 1,000. For large numbers: start at ~80% of final value and count last 20% only — captures the drama without waiting. Typography: large, bold, sans-serif. White or brand color. Unit (%, x, $) in slightly smaller size positioned to upper-right of number. Context label in smaller size below: 'increase in MRR' or 'hours saved per week.' Sound: rapid soft ticks accelerating and decelerating with the count."
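The count-up behaviour can be previewed numerically before writing it into a prompt. The sketch below is one possible reading of the template, under stated assumptions: smoothstep stands in for the ease-in-out curve, 24 frames per second is an arbitrary choice, and numbers of 1,000 or more are treated as "large" and start at 80% of the target. All names are hypothetical.

```python
def ease_in_out(t: float) -> float:
    """Smoothstep easing for t in [0, 1]: slow start, fast middle, slow finish."""
    return t * t * (3 - 2 * t)


def count_up_frames(target: int, duration_s: float = 2.0, fps: int = 24,
                    large_threshold: int = 1000) -> list[int]:
    """Return the number displayed on each frame of the count-up."""
    # Large numbers skip the first 80% and only animate the last stretch.
    start = 0 if target < large_threshold else round(target * 0.8)
    steps = max(1, round(duration_s * fps))
    return [
        round(start + (target - start) * ease_in_out(i / steps))
        for i in range(steps + 1)
    ]


small = count_up_frames(347, duration_s=1.5)
large = count_up_frames(4847)
print(small[0], small[-1])  # prints 0 347
print(large[0], large[-1])  # prints 3878 4847
```

Because smoothstep is flat at both ends, the displayed value changes slowly on the first and last frames, which matches the "decelerates into final number" description.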
Lighting for Testimonials
Interview Warm
The standard for trustworthy, human social proof. Conveys authenticity.
"Classic interview three-point lighting for testimonial video. Key light: large soft source (simulated softbox or window) at 45-degree angle to subject's face, slightly above eye level. Warm color temperature: 3200-4000K. This is the main illuminating source. Fill light: on the shadow side of face. 1.5 to 2 stops dimmer than key. Slightly cooler than key (4500K) — creates subtle color contrast between lit and shadow sides that gives dimension. Rim/hair light: from behind and above subject. Separates head from background. Thin edge of warm light on shoulder and hair. Not a halo — subtle. Background: 1 stop darker than subject. Optional: practical light in background (desk lamp, window glow) to give depth and environment context. Result: subject appears warmly lit, three-dimensional, trustworthy. Skin tones are flattering. Eyes catch the key light — the 'catchlight' that makes a face feel alive."
Documentary Natural
Feels unplanned and authentic. Highest trust signal for skeptical audiences.
"Simulated available-light documentary treatment. Primary source: window light from one side. Hard-ish quality — not perfectly diffused. Creates a natural falloff across subject's face. No fill, or very minimal fill on shadow side (1/4 reflected light only). The asymmetry reads as real, not studio. Color temperature: matches simulated time of day — morning (5500K, slightly blue-white), midday (6500K, neutral), golden hour (3200K, warm orange). Commit to one — no mixed sources. Background: environment visible, in focus enough to be readable but not sharp. Real spaces: kitchen table, coffee shop, workshop, outdoor setting. Nothing generic. Camera: very slight handheld quality — not shaky, but not locked-off. 0.5-1px micro-movement that communicates 'someone was there filming this.' Grade: minimal processing. Slight desaturation (10%). Natural contrast. No heavy LUT. Feels like the phone captured it, but better."
Success Glow
For triumph moments, results reveals, and high-emotion payoffs.
"Warm, elevated lighting treatment that signals victory and transformation. Key light: golden-warm (2700-3000K), positioned slightly lower than standard interview — 30 degrees above eye level rather than 45. This angle catches the face differently — more like sunrise/sunset, less like office overhead. Background light: significantly warmer than subject light (2000-2500K), or a practical warm glow element (candle, fireplace, warm lamp). Background appears luminous, almost glowing. Exposure: slightly overexposed compared to interview norm — 1/3 stop hot. Highlights clip gently. Creates the visual language of 'bright' and 'successful.' Optional element: subtle lens flare or light streak from key light direction. Small — not a J.J. Abrams production. Just a hint that the light is real and present. Grade in post: lift the shadows slightly (not crush them). Add warmth in midtones. Final image should feel like the best day of that person's life."
Sound Design for Testimonial Videos
Voice-First Design
In testimonial videos, the human voice is always primary. Everything else serves it.
VOICE-FIRST MIX: "Customer voice sits at -6dB in the mix. Clear, present, centered stereo. No reverb: voice should feel intimate and close, not roomy. EQ: gentle high-pass at 80Hz (remove rumble), slight presence boost at 3-5kHz (intelligibility). De-ess if needed, since sibilance is fatiguing. Music bed: -18 to -20dB under voice. Listener should barely be aware of music consciously. It should only register when the voice pauses. Music should never fight for the same frequency space as the voice, so avoid music with heavy midrange when voice is present. Sound effects (if any): brief, purposeful. A notification sound for a result metric. A soft chime on a star rating. These punctuate, they do not dominate. The test: can you understand every word without concentration? If yes, the mix is right."
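For readers who think in linear gain rather than decibels, the standard amplitude conversion is gain = 10^(dB / 20). The sketch below applies it to the levels in the mix template. It assumes the voice and music figures are both measured against the same reference level, which the template does not state explicitly. If the music figure is instead meant as 18-20 dB below the voice, the gap is larger than shown here.

```python
def db_to_gain(db: float) -> float:
    """Convert a level in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


VOICE_DB = -6.0
MUSIC_DB_QUIET, MUSIC_DB_LOUD = -20.0, -18.0  # range from the template

for label, db in [("voice", VOICE_DB),
                  ("music, quiet end", MUSIC_DB_QUIET),
                  ("music, loud end", MUSIC_DB_LOUD)]:
    print(f"{label}: {db} dB -> gain {db_to_gain(db):.3f}")
# prints:
# voice: -6.0 dB -> gain 0.501
# music, quiet end: -20.0 dB -> gain 0.100
# music, loud end: -18.0 dB -> gain 0.126

# Smallest gap between voice and music under this reading, in dB.
gap_db = VOICE_DB - MUSIC_DB_LOUD
print(f"voice is {db_to_gain(gap_db):.1f}x the music amplitude at a {gap_db:.0f} dB gap")
# prints: voice is 4.0x the music amplitude at a 12 dB gap
```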
Ambient Storytelling
Sound environment that tells the 'after' story without words.
AMBIENT TESTIMONIAL SOUNDSCAPE: "For B2B/work context: light keyboard sounds, distant office ambience, occasional notification sound that resolves positively. Signals productivity, accomplishment, flow state. For consumer/lifestyle: warm room tone, distant city or nature ambience, coffee sounds, comfortable environment signals. Signals ease, comfort, elevated life. For high-energy results: subtle energy in the room tone — a crowd reference, outdoor air, movement and vitality. Signals a life in motion. Build: ambient starts low (barely audible). As testimonial emotional peak approaches, ambient subtly rises +3dB. Creates unconscious sense of building energy. At resolution: ambient settles back, music may swell slightly. Clean, satisfying completion."
Emotional Music Timing
MUSIC TIMING FOR TESTIMONIALS: "Music enters under opening visual hook: soft, establishing tone. If opening is 'before' state: music should be slightly tense or neutral. If opening is 'after' state: music should be warm and aspirational from start. At testimonial credibility moment (result reveal, key quote): music adds an element. A second instrument enters, or the arrangement becomes slightly fuller. The harmonic environment enriches, an unconscious signal of elevation. At final frame/CTA: music reaches resolution. Satisfying harmonic landing. Not a full swell (social proof videos are not trailers) but a complete, settled ending. The viewer should feel something has been concluded. Genre guidance:
- B2B/SaaS: minimal piano or guitar, light percussion, professional and warm
- Consumer product: acoustic guitar, light pop production, relatable and optimistic
- Health/wellness: organic instruments (piano, strings), calm and reassuring
- E-commerce/fashion: upbeat but sophisticated, 105-120 BPM, modern production"
Complete Example Prompts
Example 1: B2B Case Study — The Result-First Format (15s)
SEEDANCE 2.0 PROMPT:
Black frame. Single number fades in center: "312%". Large, white, bold sans-serif. Below it in smaller text, also fading in 0.5s later: "increase in qualified leads. 90 days."
0-2s: Number holds. Camera slowly pushes in toward the number with subtle, barely perceptible movement. Sound: deep, resonant single note. Ambient silence otherwise. The number is given space to land.
2-3s: Dissolve transition. Number dissolves into a face: the customer. @material[image1] used as reference for customer portrait or as direct image source. Subject seated slightly right of center in warm interview framing, looking left-of-camera. Background: soft-bokeh office environment. Lighting: warm three-point interview setup, key light from left, slight rim light separating from background.
3-7s: Customer begins speaking. Camera is static. Every word is clear. Warm room tone underneath voice. Below subject, lower-third text appears at 4s: "[Customer Name] - [Title], [Company]". Clean sans-serif, white, small. Above the lower-third, no other graphics yet; let the face tell the story. Sound: minimal music bed enters at 3s, -20dB under voice. Piano and light percussion. Warm, confident. Does not compete with voice.
7-10s: Key quote moment. The most powerful phrase the customer says. At this exact point, text appears over or beside the customer: the quote itself in large typography. White, bold, appears word by word with 0.1s stagger. Camera micro push-in: 1% zoom over 3 seconds, imperceptible but felt. Music bed adds second element (pad or second instrument) at 7s.
10-13s: Camera pulls back slightly. Product screenshot or UI appears in background on desk or screen, associating the result with the product. @material[image2] used as product UI reference. Soft focus on product UI: it is present but the customer remains the focal point. Sound: music becomes slightly fuller. Voice continues or concludes. Ambient environment enriches as the background becomes more alive.
13-15s: Final frame. Customer finishes speaking. Five-star rating appears below lower-third, animated left to right, 0.15s per star stagger, gold. Platform attribution beneath stars. Product logo in upper-right corner, small, fades in at 14s. Music resolves on final second. Warm, complete.
Material references: @material[image1] for customer photo, @material[image2] for product screenshot, @material[audio1] for customer voice recording if available.
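The timestamp structure of Example 1 (0-2s, 2-3s, 3-7s, 7-10s, 10-13s, 13-15s) can be checked mechanically before the prompt is submitted. The sketch below is a hypothetical helper and not part of Seedance. It verifies that segments start at zero, meet end to start, and finish inside the 4-15 second output window, then renders them in the same "start-end s:" style. The segment descriptions are shortened paraphrases of the example.

```python
Segment = tuple[float, float, str]  # (start_s, end_s, description)


def check_timeline(segments: list[Segment], min_total: float = 4.0,
                   max_total: float = 15.0) -> list[str]:
    """Return a list of timeline problems; empty means the timeline is consistent."""
    if not segments:
        return ["no segments"]
    problems = []
    if segments[0][0] != 0:
        problems.append("timeline does not start at 0s")
    # Each segment must end exactly where the next one starts.
    for (_, end, _), (next_start, _, _) in zip(segments, segments[1:]):
        if end != next_start:
            problems.append(f"segment ending at {end}s does not meet the next one at {next_start}s")
    total = segments[-1][1]
    if not min_total <= total <= max_total:
        problems.append(f"total length {total}s is outside {min_total}-{max_total}s")
    return problems


def render(segments: list[Segment]) -> str:
    """Render segments as one 'start-ends: description' line each."""
    return "\n".join(f"{start}-{end}s: {text}" for start, end, text in segments)


example_1 = [
    (0, 2, "Number holds."),
    (2, 3, "Dissolve into customer's face."),
    (3, 7, "Customer begins speaking."),
    (7, 10, "Key quote moment."),
    (10, 13, "Camera pulls back, product UI appears."),
    (13, 15, "Final frame with star rating."),
]
print(check_timeline(example_1))  # prints []
print(render(example_1).splitlines()[0])  # prints 0-2s: Number holds.
```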
Example 2: Consumer Review — The Emotion Capture (10s)
SEEDANCE 2.0 PROMPT:
Open on close-up of a face mid-expression: a genuine, warm smile. Eyes slightly crinkled. The expression is specific, not posed. Tight framing: face fills 70% of frame. Background: soft bokeh, warm residential environment (living room, kitchen; identifiable but indistinct).
0-1.5s: Face holds in close-up. No words yet. Sound: warm room tone, faint background ambience of a comfortable home. Light music begins: acoustic guitar, quiet, barely audible. The face communicates everything before language.
1.5s: Camera pulls back slowly. Person becomes visible in context, seated comfortably, the product visible in their hands or on the surface near them. @material[image1] used as product reference. Natural, candid framing. Documentary natural lighting treatment: window light from one side, slight asymmetry, real and earned.
2-5s: Person speaks. Voice is clear, slightly warm in audio treatment. Room tone underneath. They are sharing a specific moment, not a general endorsement. The specificity is the credibility. At 3s: first text overlay appears, a short fragment of what they're saying, pulled into large typography. 3-4 words max. Bold, white, positioned lower-third but large: almost subtitle-sized but treated as a design element.
5-7s: Product comes into focus. Camera drifts slightly to include product more centrally. Person's hands interact with it naturally. Light catches the product. @material[image1] product reference at this framing moment. The product is shown in use, in life, not in a studio. Sound: music bed grows subtly +3dB from 5-7s. Not louder, richer: a second guitar line enters. Signals the story is reaching its point.
7-9s: Quote completes. Final sentence lands. At the moment the key word lands (the result, the feeling, the reason) a second graphic element appears: star rating (5 stars, gold, staggered entrance) and product name below. Camera finishes its drift, composition settles.
9-10s: Hold on the settled frame. Music resolves. Room tone continues. Warm, complete, real.
Material references: @material[image1] for product (for visual reference/integration). @material[audio1] for customer audio if available. @material[image2] for product logo.
Example 3: SaaS Case Study — The Data Visualization (12s)
SEEDANCE 2.0 PROMPT:
Dark background, near-black with slight warm undertone (#0D0B09). Premium data environment. Single line chart, minimal axes, centered frame.
0-1s: Empty chart axes appear: clean white lines, thin, on dark background. A single label: "Monthly Recurring Revenue". Sound: soft ambient electronic, deep and slow. A single ambient tone, held and expectant.
1-4s: Chart line begins building from left. It moves slowly, nearly flat: the plateau before the change. Color: muted blue-gray, #8B9DC3. This is the "before" period. The flatness is visible and intentional. Camera: static. Sound: no change. Flat and honest about what was happening.
At exactly 4s: vertical dotted line appears on the chart: "Started using [Product]". Text label appears at the top of this marker line in small caps. Sound: a subtle, quiet tone, not dramatic, just a marker. A chime, brief.
4-7s: From the marker line, the chart diverges upward. New color: warm gold #F5A623. The line climbs. Not gradually: it accelerates. Camera begins a slow push-in toward the chart as it climbs. Sound: ascending ambient tone, very subtle. The musical harmonic rises with the line. Not heavy-handed; felt more than heard.
At the endpoint of the line (7s): the final value appears as a bold number floating adjacent to the chart endpoint. Large, white, clear. Below it in smaller text: the timeframe. @material[image1] if a real chart/data visualization is provided as reference.
7-9s: Hold on the chart with the result visible. Camera rests. Sound: music bed settles at this moment into a two-bar loop, warm, clean. Below the chart, customer attribution fades in: "[Company Name]", small logo mark if available via @material[image2]. Customer photo in small circle avatar to left of attribution.
9-11s: Customer voice enters (if available via @material[audio1]). One or two sentences: the human confirmation of the chart's story. A single quote line appears in large typography, one phrase, above the chart. Bold, white. The data told the story; the voice confirms it.
11-12s: Product logo appears center-right. Chart and quote remain visible. Music resolves. Final composition: data, attribution, product logo. Three elements of proof in one clean frame.
Material references: @material[image1] for actual data chart or graph as visual reference. @material[image2] for company/customer logo. @material[audio1] for customer voice.
Example 4: Review Montage — The Flood of Proof (8s)
SEEDANCE 2.0 PROMPT:
Clean white or very light gray background. Premium, minimal.
0-1s: Single large number appears center-frame, counting up rapidly. The aggregate: "4,847 reviews". Counter accelerates and decelerates into final number. Bold, dark typography on light background. Below it, five gold stars in a horizontal row fade in together. No stagger here; they arrive as a unit. Sound: rapid counting ticks, resolving into a satisfying two-note landing as counter stops.
1s: Hard cut. First customer face, tight portrait. Warm, documentary lighting. Off-camera look. 0.8s hold. Below face: name, city or role. Camera: slight push-in. Sound: ambient warmth. Brief audio slice of their voice: one word or half a second of speech.
1.8s: Hard cut. First result metric. Bold number, large, center-frame. Dark background (complement to light-background face shots, alternating treatment). "47 hours saved per month." Number animates in with impact. Camera: static. Sound: single bass note thud on impact.
2.6s: Cut. Second face. Different demographic: different gender, age, context than first. Same lighting treatment, same shot structure; consistency signals intentionality. Off-camera look, warm light. 0.8s. Audio slice: one-word or brief exclamation.
3.4s: Cut. Second result metric. Different color treatment, same bold typography. "3x conversion rate." Same impact animation. Sound: same bass note, which creates rhythmic expectation so the viewer anticipates each cut.
4.2s: Cut. Third face. Wide shot this time, with a bit more environment visible. Person at their workspace. The context is meaningful.
4.8s: Cut. Third metric, this time a star rating visual with the score. "4.9 / 5.0" with star cluster. More complex graphic, slightly longer hold.
5.8s: Cut. Montage accelerates: two more rapid face/metric pairs. 0.4s each. Music track (entering at 4s at -18dB) becomes audible here, riding the rhythm of the cuts. Faces and numbers intercut like heartbeats.
7s: Hard cut to black. Single frame. Sound: everything stops.
7-8s: Slow fade in: product logo, centered, white on dark. Tagline appears below. "Join 4,847 customers." Music final note sustains and releases. Clean, complete.
Material references: @material[image1]-@material[image6] for customer photos (different people). @material[image7] for product logo. All faces should represent diversity in age, gender, and context.
Prompt Rules for Testimonial Content
- Lead with the result, not the person. Viewers decide in the first two seconds whether to trust what follows. Give them the evidence first (the number, the metric, the outcome), then introduce the human source.
- Make every quote earn its screen time. A quote that lasts longer than 3 seconds needs to work visually as hard as it works verbally. Pair it with motion, data, or complementary imagery. Never let static text sit on a static background for more than 2 seconds.
- Specificity is the currency of credibility. "I doubled my revenue" is weaker than "We went from $12K to $31K MRR in 11 weeks." Prompt for specific numbers, timeframes, and named outcomes wherever possible.
- Customer voice is the primary audio signal. If you have a voice recording, it goes first in the mix. Music and sound effects serve the voice; they do not share equal footing with it. A viewer who cannot clearly hear the customer will not trust the testimonial.
- Match the visual register to the audience. B2B testimonials require authority signals: clean environments, professional lighting, data visualization. Consumer testimonials require relatability signals: real spaces, natural light, organic moments. Never use a studio treatment for a consumer brand, or a casual treatment for an enterprise pitch.
- Star ratings need a source. Five stars with no platform attribution reads as fabricated. Always include the platform name, even small, even subtle. The attribution is the proof behind the proof.
- The final frame is a commitment device. The last image the viewer sees before deciding to act should be: product in context, or the customer's face with the result attribution, or the aggregate social proof signal. Never end on a generic call-to-action graphic.
- Avoid the endorsement pose. A customer facing directly into the camera with a wide smile, speaking scripted praise, is the exact visual language of advertising everyone has learned to distrust. Aim for slightly off-camera, genuine expression, specific language. If it looks like a testimonial, it performs like one.