Overdigital · Build Log

Showspring

Live · running weekly · tested by a real show

The build log for Showspring — the independent, off-hours AI project behind The Doodle Cast. The project itself is at showspring.com. This page is the workshop: the pipelines, what runs underneath, and what’s on the bench next.

180K+ · Lines of Code (app + GPU pipeline)
10+ · AI Models
6 · Publish Platforms
3 · Production Pipelines

AI tools generate clips. Showspring produces episodes.

Veo, Kling, Higgsfield, and Firefly are remarkable at generating individual video clips. But producing a complete episode (with narrative structure, multi-character dialogue, consistent visuals, sound design, and music) still takes dozens of hours of manual stitching and editing. Showspring closes that gap.

Three production pipelines (full episodes, shorts, podcasts), a shared show bible, and a render engine built from scratch for AI-native output. Solo build, deployed nightly, tested in production by an actual show.

From idea to YouTube in ten steps.

The full episode pipeline. Each step has its own workspace, its own AI assist, and persists state across reloads. Skip ahead, jump back, regenerate one scene, ship the whole thing.

Step 01

Creative Director

Three AI personas independently pitch episode concepts, then debate their merits. A judge model evaluates each pitch with live web search for topical relevance and picks a winner — or you bring your own idea and let the panel pressure-test it.

Multi-model debate · Web research · Bring-your-own
[Image: Creative Director, AI brainstorming with multi-model debate]
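
The pitch-then-judge flow described above can be sketched in a few lines. This is a minimal illustration, not Showspring's actual code: `Pitch`, `pick_winner`, and `keyword_judge` are hypothetical names, and the keyword-overlap judge is a stand-in for what the page describes as a judge model with live web search. The debate round is omitted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Pitch:
    persona: str   # which AI persona produced the pitch
    concept: str   # one-line episode concept


# Hypothetical judge signature: takes a pitch, returns a score (higher is better).
Judge = Callable[[Pitch], float]


def pick_winner(pitches: list[Pitch], judge: Judge) -> Pitch:
    """Score every pitch and return the best one (the first wins on ties)."""
    if not pitches:
        raise ValueError("need at least one pitch")
    return max(pitches, key=judge)


def keyword_judge(topical_terms: set[str]) -> Judge:
    """Stand-in judge: counts how many topical terms a concept mentions."""
    def score(pitch: Pitch) -> float:
        words = {w.strip(".,!?").lower() for w in pitch.concept.split()}
        return float(len(words & topical_terms))
    return score
```

A bring-your-own idea fits the same shape: add it to the list as one more `Pitch` and let the judge score it alongside the others.
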
Step 02

Script Writer

Generate a full script with structured clips, character dialogue, scene descriptions, and image prompts — on the LLM of your choice. The writer is grounded in the show bible: a living knowledge base that evolves with every episode, keeping characters consistent and plotlines from repeating.

Configurable LLM · Show bible · 30+ clips / episode
[Image: Script Writer with LLM selection and template system]
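
One way to picture a "structured" script is as a list of clips, each carrying its dialogue, scene description, and image prompt. The sketch below is an assumed shape, not the real schema: the class and field names are hypothetical, and `unknown_speakers` illustrates the kind of check a show bible makes possible, namely flagging dialogue from a character the bible does not know.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueLine:
    character: str
    text: str


@dataclass
class Clip:
    index: int
    location: str
    scene_description: str
    image_prompt: str
    dialogue: list[DialogueLine] = field(default_factory=list)


def unknown_speakers(clips: list[Clip], cast: set[str]) -> list[tuple[int, str]]:
    """Return (clip index, character) for every line spoken by a non-cast name."""
    return [
        (clip.index, line.character)
        for clip in clips
        for line in clip.dialogue
        if line.character not in cast
    ]
```
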
Step 03

Voice Readout

Every character speaks in a distinct synthesized voice. Play through the full episode readout to check pacing, dialogue flow, and structure before committing GPU cycles to visual production. Cheaper to fix a script than a render.

ElevenLabs · Per-character voices · Full episode playback
[Image: Script readout with voice profiles and clip navigation]
Step 04

Locations

AI extracts every location mentioned in the script and maps them to clips. Each location lives in a reusable library with reference images, descriptions, and default prompts. The studio always looks like the studio — week 9 matches week 1.

Auto-extraction · Reference library · Cross-episode persistence
[Image: Location extraction and mapping interface]
Step 05

Scene Generation

Generate images for each clip, informed by character reference sheets, location plates, and scene context. Start and end images per clip enable smooth image-to-video. Full history with undo, AI-assisted editing, and an iPad/PC watcher for hand-drawn or live-camera source frames.

Multi-model image gen · Start / end frames · Image history + undo
[Image: Scene generation with reference images and visual controls]
Step 06

Video Generation

Turn scene images into motion using cloud video models (Veo 3.1, Kling 3.0, Seedance 2.0) for production quality, or open models on a local RTX 5090 (WAN 2.2 for video, Z-Image-Turbo for stills) for iteration at no per-clip cloud cost. An NLE-style timeline shows every clip with status badges and a composite preview player that sequences completed clips in real time.

Cloud + local routing · I2V models · Composite preview
[Image: Video timeline with episode preview and clip generation]
Step 07

Trim Editor

Frame-accurate in/out markers per clip. Adjust durations, preview instantly. The trimmed timeline carries forward to the audio mix and final render — one source of truth from edit through publish.

Frame-accurate · Real-time preview
[Image: Video preview with clip navigation]
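
Frame-accurate trimming usually means storing markers as frame numbers and converting to time only at the last moment, with exact arithmetic. The sketch below shows that idea with Python's `Fraction`. The `Trim` type and function names are hypothetical, and treating the out marker as exclusive is an assumption about the convention.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Trim:
    in_frame: int    # first frame kept (inclusive)
    out_frame: int   # first frame dropped (exclusive, by assumption)

    def __post_init__(self) -> None:
        if not 0 <= self.in_frame < self.out_frame:
            raise ValueError("need 0 <= in_frame < out_frame")


def frame_to_seconds(frame: int, fps: Fraction) -> Fraction:
    """Exact timestamp of a frame boundary; no float rounding drift."""
    return Fraction(frame) / fps


def trim_window(trim: Trim, fps: Fraction) -> tuple[Fraction, Fraction]:
    """Return (start, duration) in seconds for a trimmed clip."""
    start = frame_to_seconds(trim.in_frame, fps)
    duration = frame_to_seconds(trim.out_frame - trim.in_frame, fps)
    return start, duration
```

Keeping the frame rate as a fraction matters for rates such as 30000/1001, where repeated float conversions can land a cut one frame off.
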
Step 08

Audio Mix

Four-lane multi-track editor: background video audio, voice dialogue, sound effects, music. Independent volume per track with keyframe automation. Generate SFX and music from text descriptions, drop them on the timeline, fine-tune the mix — all in the browser.

4-track mix · Keyframe automation · AI SFX + music
[Image: Multi-track audio editor with waveforms and keyframes]
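
Keyframe automation on a volume lane comes down to interpolating gain between neighbouring keyframes. The function below is a minimal sketch of that, assuming linear interpolation and keyframes stored as sorted `(time_seconds, gain)` pairs. Both are assumptions; the page does not say which curve shape or storage format the editor uses.

```python
from bisect import bisect_right


def gain_at(keyframes: list[tuple[float, float]], t: float) -> float:
    """Linearly interpolated gain at time t.

    keyframes must be sorted by time. Before the first keyframe the first
    gain holds; after the last keyframe the last gain holds.
    """
    if not keyframes:
        return 1.0  # no automation: unity gain
    times = [time for time, _ in keyframes]
    i = bisect_right(times, t)
    if i == 0:
        return keyframes[0][1]
    if i == len(keyframes):
        return keyframes[-1][1]
    (t0, g0), (t1, g1) = keyframes[i - 1], keyframes[i]
    return g0 + (g1 - g0) * (t - t0) / (t1 - t0)
```

For example, a two-second fade-out on the music lane is just the keyframes `[(0.0, 1.0), (2.0, 0.0)]`.
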
Step 09

Render Engine

One button, full episode render. The engine trims each clip, applies the audio mix through hand-rolled ffmpeg filter graphs, concatenates everything, and encodes the final MP4. When the local GPU is online it runs on NVENC; otherwise CPU fallback keeps shipping.

Custom ffmpeg engine · GPU / CPU fallback · Drive auto-upload
[Image: Render engine with progress tracking and video player]
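
To make the trim, concatenate, and encode sequence concrete, here is a sketch that builds an ffmpeg argument list with a `-filter_complex` graph. It is illustrative only: the real filter graphs are not published, `ClipTrim` and `build_render_args` are hypothetical names, and the audio mix is left out, so this covers the video path alone. The `trim`, `setpts`, and `concat` filters and the `h264_nvenc` and `libx264` encoders are standard ffmpeg; choosing H.264 here is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipTrim:
    path: str
    start: float  # seconds
    end: float    # seconds


def build_render_args(clips: list[ClipTrim], output: str, gpu_online: bool) -> list[str]:
    """Build an ffmpeg command that trims each clip and concatenates the results."""
    if not clips:
        raise ValueError("need at least one clip")
    args = ["ffmpeg", "-y"]
    for clip in clips:
        args += ["-i", clip.path]

    chains, labels = [], []
    for i, clip in enumerate(clips):
        # Trim the i-th input, then reset timestamps so concat sees each clip at t=0.
        chains.append(
            f"[{i}:v]trim=start={clip.start}:end={clip.end},setpts=PTS-STARTPTS[v{i}]"
        )
        labels.append(f"[v{i}]")
    chains.append(f"{''.join(labels)}concat=n={len(clips)}:v=1:a=0[outv]")

    # Prefer NVENC when the local GPU is reachable, otherwise fall back to CPU.
    encoder = "h264_nvenc" if gpu_online else "libx264"
    args += ["-filter_complex", ";".join(chains), "-map", "[outv]", "-c:v", encoder, output]
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)` on a machine with ffmpeg installed. Building the list separately from running it keeps the graph easy to inspect and test.
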
Step 10

Publish

AI thumbnails for A/B testing, AI-written metadata, tags and categories, then direct publish to YouTube with real-time upload progress. The show bible learns from each episode that ships — what landed, what didn’t — and feeds it back into the next pitch.

Thumbnail A/B · AI metadata · Show-bible feedback
[Image: YouTube publish with AI thumbnails and A/B testing]

A parallel pipeline for vertical, batched at eight.

Eight shorts at a time from a single theme — each with unique characters, dialogue, image, animation, voice, and music. Publishes across YouTube, TikTok, Instagram, Facebook, and X with scheduled auto-posting and AI-recommended slots.

Shorts · 01

Idea Lab

Eight ideas at once from the character pool, with optional research-report grounding for topical hooks.

Shorts · 02

Script Writer

Per-clip scene + action prompts, 8–12 word dialogue, visual-coherence rules baked in for clean I2V.

Shorts · 03

Start Images

9:16 starting frames grounded in character references, with full image history and per-clip regen.

Shorts · 04

Video Generation

Image-to-video on cloud models for production runs or local GPU for free iteration. Same operator surface.

Shorts · 05

Voice Studio

Multi-track audio per short — original, voice, and AI-scored music. Batch-replace or fine-tune.

Shorts · 06

Render

Batch render all eight, or selectively re-render. Outputs land in Drive and are downloadable on the spot.

Shorts · 07

Publish & Schedule

Five-platform fan-out, OAuth-authenticated, visual calendar, AI-suggested posting times, alerts on misses.
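
The fan-out and the "alerts on misses" can be modelled as one scheduled post per platform plus a periodic check for posts that are overdue and still unpublished. This is a sketch under assumptions: `ScheduledPost`, `fan_out`, and `missed` are hypothetical names, the 15-minute grace period is invented for illustration, and the slot times would come from the AI slot recommender, which is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The five platforms named on this page.
PLATFORMS = ("youtube", "tiktok", "instagram", "facebook", "x")


@dataclass
class ScheduledPost:
    short_id: str
    platform: str
    due: datetime
    published: bool = False


def fan_out(short_id: str, slots: dict[str, datetime]) -> list[ScheduledPost]:
    """Create one scheduled post per platform from a platform -> time mapping."""
    return [ScheduledPost(short_id, platform, slots[platform]) for platform in PLATFORMS]


def missed(
    posts: list[ScheduledPost],
    now: datetime,
    grace: timedelta = timedelta(minutes=15),
) -> list[ScheduledPost]:
    """Posts still unpublished more than `grace` after their slot."""
    return [p for p in posts if not p.published and now - p.due > grace]
```
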

Cadence

Calendar view

The whole posting plan at a glance — per platform, per day — with the auto-publisher in the loop.

An audio-only feed that runs on the same bible.

Conversational episodes where the show’s characters discuss real topics, with multi-voice dialogue, AI cover art, and direct publish. Same characters, same continuity, different format. Target runtimes range from 2 to 120 minutes.

Podcast · 01

Idea Lab

AI-pitched concepts with topic scoring, bring-your-own ideas, or deep-research mode with live web grounding.

Podcast · 02

Multi-Character Script

Conversation scripts with character dialogue, word-count tracking, and runtime estimates against your target.
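
A runtime estimate from word count is simple arithmetic: total words divided by a speaking rate. The sketch below assumes 150 words per minute as the default, which is a common rule of thumb and not a figure from this page. The function names are hypothetical.

```python
def word_count(lines: list[tuple[str, str]]) -> int:
    """Total words across (character, text) dialogue lines."""
    return sum(len(text.split()) for _, text in lines)


def estimated_minutes(lines: list[tuple[str, str]], wpm: float = 150.0) -> float:
    """Estimated spoken runtime in minutes at the given words-per-minute rate."""
    return word_count(lines) / wpm


def words_to_target(
    lines: list[tuple[str, str]], target_minutes: float, wpm: float = 150.0
) -> int:
    """Words still needed to hit the target; negative means the script runs long."""
    return round(target_minutes * wpm) - word_count(lines)
```

At that rate a 2-minute target needs roughly 300 words and a 120-minute target roughly 18,000, which shows why long episodes need batching.
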

Podcast · 03

Cover Art

Up to four AI variants per episode, informed by character refs and the episode brief. Pick one or regen.

Podcast · 04

Preview & Publish

Full episode preview with character avatars and color-coded dialogue. Review the flow, then ship to the feed.

Podcast · 05

Multi-Voice Dialogue

A text-to-dialogue API generates natural multi-character conversations in one audio stream, turn-taking included.

Podcast · 06

Audio Production

Auto intro/outro, configurable silence gaps, act-based batching for 15+ minute episodes — publish-ready audio.

Inside the episode: news desk, OTS graphics, a real audience.

Two systems that turn a flat clip sequence into something that feels like a show. A late-night news-desk segment generator, and a per-clip audience reaction track sourced from a multi-take stem library.

Segment · Fire Hydrant Gazette

Late-night news comedy, scripted by formula.

Extensible segment templates encode format rules, comedy mechanics, joke structures, and a small visual style guide — baked into the AI script step itself. Anchor solo shots and two-shots are rendered with OTS news graphics composited per character in the render pass.

Segment templates · Joke formulas · OTS compositing
[Image: Fire Hydrant Gazette anchor two-shot with OTS]
Mix · Audience reactions

Studio 8H in a database.

Per-clip audience-reaction track from a multi-take stem library — thirteen reaction types, density dial, and a multimodal cue planner that listens to the cut and picks where laughs, gasps, and silence land. The audio bed is generated for the episode, not borrowed from a library anyone else can use.

13 reaction types · Density dial · Multimodal cue planner
[Image: Audio editor with audience reactions and dialogue lanes]
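
One plausible reading of a density dial is a fraction between 0 and 1 that decides how many of the planner's candidate cues survive, strongest first. The sketch below implements that reading. It is an assumption about the behaviour, not a description of the real planner, and `Cue` and `apply_density` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    clip_id: str
    at_seconds: float
    reaction: str   # e.g. "laugh" or "gasp"
    score: float    # planner confidence that a reaction belongs here


def apply_density(cues: list[Cue], density: float) -> list[Cue]:
    """Keep the highest-scoring fraction of cues, returned in timeline order."""
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be between 0 and 1")
    keep = round(len(cues) * density)
    strongest = sorted(cues, key=lambda c: c.score, reverse=True)[:keep]
    return sorted(strongest, key=lambda c: (c.clip_id, c.at_seconds))
```

Turning the dial to 0 leaves silence everywhere, which is itself one of the outcomes the planner is described as choosing.
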

One character platform: the show, the live chat, the audience.

The same show bible that produces the episodes powers two more things: live voice conversations with the cast, and a pipeline that turns viewers’ own dogs into cast members.

Live · Character Voice

The character you watch is the character you talk to.

Public cast chat on thedoodlecast.com: pick a character, tap the mic, talk. Each persona is assembled from its Showspring show-bible section (auto-refreshing when the bible changes) and speaks through the same ElevenLabs voice ID used in the episode pipeline. New cast members become talkable the moment they get a bible section — no per-character code.

Bible-driven personas · Shared voice IDs · Local LLM + cloud fallback
[Image: Character manager with per-character voice profiles]
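
"Auto-refreshing when the bible changes" suggests a persona prompt that is rebuilt whenever its source section changes. A simple way to get that is to fingerprint the section and rebuild on mismatch, as sketched below. The field names follow the Character Manager description elsewhere on this page; the class names, prompt wording, and hashing approach are assumptions.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BibleSection:
    name: str
    personality: str
    speech_style: str
    voice_id: str   # same voice ID the episode pipeline uses


def fingerprint(section: BibleSection) -> str:
    """Stable hash of the fields that shape the persona."""
    raw = "\x1f".join(
        (section.name, section.personality, section.speech_style, section.voice_id)
    )
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class PersonaCache:
    """Caches one prompt per character and rebuilds it when the section changes."""

    def __init__(self) -> None:
        self._cache: dict[str, tuple[str, str]] = {}

    def prompt_for(self, section: BibleSection) -> str:
        fp = fingerprint(section)
        cached = self._cache.get(section.name)
        if cached is None or cached[0] != fp:
            prompt = (
                f"You are {section.name}. "
                f"Personality: {section.personality}. "
                f"Speech style: {section.speech_style}."
            )
            cached = (fp, prompt)
            self._cache[section.name] = cached
        return cached[1]
```

Because the prompt is derived entirely from the section, a new cast member needs only a bible entry to become talkable, which matches the "no per-character code" claim.
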
Pipeline · Guest onboarding

Viewers pitch their dog into the cast.

A viewer submits photos and a few lines (multi-dog households supported). A multimodal model turns the photos into structured character drafts; Showspring generates a per-dog portrait, a “Hello” voice line, and a video preview; the submitter picks their dog’s voice through a shareable voice-pick link. On approval the dog is promoted to a cast character with its own bible section and pinned voice — episode-ready, and instantly talkable in the live chat.

Photo → character drafts · Voice-pick share links · Promoted to cast
[Image: Per-clip voice settings in the Showspring pipeline]

The persistent layer underneath every pipeline.

Pipelines come and go. The studio that holds them together — characters, episodes, media — persists across every show, every episode, every week.

Cast

Character Manager

Personality, visual description, speech style, reference images, voice profile. The cast persists across all episodes and informs every generation.

Library

Episode Library

Every episode across every stage of production. Filter by status, search by title, jump straight into any production step.

Assets

Media Library

Every image and clip across every episode. Browse by model, date, or episode. Drag-and-drop, crop, rotate — with full undo.

World

Locations & Props

First-class sets and props with their own galleries, references, and history. The same desk in episode 14 as in episode 2.

Console

Dashboard

One screen for the state of the studio — what’s rendering, what’s queued, what’s ready to ship.

Feedback

Cross-platform Analytics

What actually landed, per platform, per episode — routed back into the show bible so the next pitch starts smarter.

Three layers. One studio.

The interesting design choices live in how these layers talk to each other — not which logos sit on which row.

Layer 01

Frontier models, mixed and matched

Multiple top-tier LLMs and image / video models, each pointed at the step where it’s actually best. Routed through a single internal bridge with usage tracking, budgets, and graceful fallbacks when a vendor blips.
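
A bridge with usage tracking, budgets, and fallbacks can be sketched as an ordered list of providers that is walked until one succeeds. Everything below is illustrative: `Provider`, `Bridge`, the flat per-call cost model, and the log format are assumptions, and real provider calls are replaced by plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]   # stand-in for a real model request
    cost_per_call: float
    budget: float
    spent: float = 0.0


@dataclass
class Bridge:
    providers: list[Provider]    # in order of preference
    log: list[tuple[str, str]] = field(default_factory=list)

    def run(self, prompt: str) -> str:
        """Try each provider in order, skipping any that would exceed its budget."""
        for provider in self.providers:
            if provider.spent + provider.cost_per_call > provider.budget:
                self.log.append((provider.name, "over budget"))
                continue
            try:
                result = provider.call(prompt)
            except Exception as exc:   # a vendor outage or timeout
                self.log.append((provider.name, f"failed: {exc}"))
                continue
            provider.spent += provider.cost_per_call   # usage tracking
            self.log.append((provider.name, "ok"))
            return result
        raise RuntimeError("all providers failed or are over budget")
```

Putting the cloud provider first and a zero-cost local provider last gives the "same bridge, different destination" behaviour: the caller asks once and the bridge decides where the request lands.
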

Layer 02

A GPU in the basement

A local RTX 5090 handles the high-volume, low-stakes batch work that would otherwise burn cloud quotas — tagging, triage, draft scoring, image and video iteration. A llama-swap router serves Gemma 26B for text; WAN 2.2 and Z-Image-Turbo cover local visuals. Same bridge, different destination.

Layer 03

Hand-rolled pipeline glue

The part nobody else has: the render engine, the show bible, the continuity system, the publish pass, the analytics feedback loop. Deployed nightly, tested in production by a real show.

Why hybrid: production-grade output runs on the best cloud model for the step. Iteration runs on the local box for free. The operator never picks — the studio routes.

What’s running, what’s next.

Running this week

  • Episodes shipping on a weekly cadence.
  • Shorts auto-publishing across YouTube, TikTok, Instagram, Facebook, X.
  • Show bible with versioned characters, props, locations.
  • Gazette segments with composited OTS graphics in the render pass.
  • Per-clip audience reactions sourced from a multi-take stem library.
  • Cross-platform analytics closing the loop on what actually lands.

On the bench

  • Better continuity tooling for long-running serialised arcs.
  • Multimodal evaluation — the studio scoring its own output.
  • Lower per-episode cost by pushing the right work local.
  • Operator surface for other shows — quietly, with a few collaborators.
  • Tighter publish loop — what runs at 8 am Tuesday, why, and how to nudge it.

Want to see it in motion?

The project page has the story, the proof, and a way to get in touch if you have a show you’d want to put through it.