The Cold File — Autonomous AI True-Crime Podcast Channel
Production Python backend that fully operates a fictional dual-host true-crime podcast on YouTube (@The_Cold_File). It is an autonomous AI system in which Claude plays several roles (writer, LLM-as-judge quality gate, casting director, community manager) across a single 30,000-line pipeline.
hash(case_title) % 2 using the same formula in both the script-gen prompt and the TTS layer, so the persona named in the dialogue always matches the voice that gets played.
Claude is not just the writer. Opus generates concepts and full dual-host scripts across 14 case types (MURDER_MYSTERY, COLD_CASE, SERIAL_KILLER, KIDNAPPING, CULT, WRONGFUL_CONVICTION, FINANCIAL_CRIME, HEIST, STALKER_TO_MURDER, CORPORATE_CONSPIRACY, FAMILY_MURDER, …) and 6 structural variants (linear chronology, case-file walkthrough, retrospective, whodunit, reverse-chronology, parallel investigations), both rotated least-recently-used to keep the catalog fresh. A separate Sonnet pass is the quality gate (LLM-as-judge): it scores N parallel candidate concepts against a weighted rubric (hook 35%, retention 30%, etc.); only the winner enters production, and the losers are logged for prompt-tuning. Sonnet then plays casting director: it tunes Azure MultiTalker inference parameters (temperature/top_p/cfg_scale) per host to match the case's emotional register (measured, clinical vs. urgent vs. retrospective). Claude also authors every per-scene image prompt, gated through case-file atmosphere maps and safety clauses (no recognizable faces, no celebrities, documentary framing only), because fictional cases must never produce imagery that resembles real victims or perpetrators.
Image generation has a two-tier fallback. The primary path is Azure Flux 2 Pro; when Flux's content filter rejects a prompt (common on crime-scene imagery), the pipeline retries the original, unmutated prompt via Nano Banana (Gemini 2.5 Flash Image through Fal.ai), which has a different filter profile, so a rejected scene still ships an image instead of falling back to a generic placeholder.
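The two-tier image fallback can be pictured with a minimal sketch like the one below. It is not the repository's code: ContentFilterRejection, ImageResult, generate_with_fallback, and the two stub providers are hypothetical names, and the real Azure and Fal.ai client calls are replaced by plain callables. The one property it tries to show is that every provider receives the same, unmodified prompt.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class ContentFilterRejection(Exception):
    """Hypothetical error a provider wrapper raises when its safety filter rejects a prompt."""


@dataclass
class ImageResult:
    provider: str        # name of the provider that produced the image
    image_bytes: bytes   # raw image payload


# A provider is modelled as (name, callable); the callable maps a prompt to image bytes.
Provider = tuple[str, Callable[[str], bytes]]


def generate_with_fallback(prompt: str, providers: Sequence[Provider]) -> Optional[ImageResult]:
    """Try each provider in order, always with the original prompt.

    Returns None only when every provider rejects the prompt, which is the
    point where a caller would fall back to a placeholder image.
    """
    for name, generate in providers:
        try:
            return ImageResult(provider=name, image_bytes=generate(prompt))
        except ContentFilterRejection:
            # Filter rejection: move to the next tier without rewriting the prompt.
            continue
    return None


# Stub providers standing in for the primary and secondary image backends (assumptions).
def stub_primary(prompt: str) -> bytes:
    raise ContentFilterRejection("rejected by primary filter")


def stub_secondary(prompt: str) -> bytes:
    return b"image-for:" + prompt.encode("utf-8")


if __name__ == "__main__":
    result = generate_with_fallback(
        "rain-soaked alley at night, documentary framing",
        [("primary", stub_primary), ("secondary", stub_secondary)],
    )
    print(result.provider if result else "placeholder")
```

Only filter rejections trigger the next tier in this sketch; other failures (timeouts, auth errors) would propagate, which is a design choice of the example and not something the description above specifies.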
Real-world signal in, abstracted prompt fuel out. A topical-signal miner pulls Google Trends and NewsAPI headlines from the true-crime space, then Claude distills them into structural beat shapes (e.g. "DNA match decades later", "convicted-killer-dies cases reopen"), never leaking real case names into the fictional show. Raw headlines stay in raw_examples for audit only. A title miner analyzes competitor channels for winning title structures, and a YouTube Analytics feedback loop produces weekly insights that feed back into concept generation.
13+ Postgres tables track everything: stories, concept_candidates, cold_open_attempts, video_performance, weekly_insights, title_patterns, topical_signals, comment_interactions, schedule, plus dedup history that prevents repeating case concepts or titles. FFmpeg compiles 1080p video with CASE FILE / COLD CASE / UNSOLVED / CASE CLOSED thumbnail badges derived from the case's resolution field; YouTube Data API v3 handles uploads, Shorts trailers, and chapter markers. Runs hourly on Railway.
Claude also runs the comment section. A Haiku-powered community engagement loop scans every uploaded video in two passes: a fast pass over the newest 15 videos every cycle to catch fresh comments, and a deep pass that rotates through the full back catalog so a comment on video #200 still gets answered within a day or two. Haiku writes the actual replies using a rotation of tone presets and pulls the original case context from Postgres so replies stay on-topic. Questions that need real case knowledge escalate dynamically to Sonnet, with model routing decided per comment. The daily YouTube API quota is tracked in the DB, and the loop self-throttles when it hits the limit.
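The two-pass comment scan and the quota check can be sketched as follows. This is an illustration under assumptions, not the channel's actual code: ScanPlan, plan_scan, and within_quota are hypothetical names, the deep-pass batch size of 10 is invented for the example, and the cursor is assumed to be persisted between cycles (for instance in Postgres). Only the fast-pass size of 15 comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class ScanPlan:
    fast: list[str]    # newest videos, scanned every cycle
    deep: list[str]    # slice of the back catalog scanned this cycle
    next_cursor: int   # where the next cycle's deep pass resumes


def plan_scan(video_ids: list[str], cursor: int, fast_n: int = 15, deep_n: int = 10) -> ScanPlan:
    """Build one cycle's scan plan from a newest-first list of video IDs.

    The fast pass always covers the newest `fast_n` videos. The deep pass
    takes `deep_n` videos from the remaining back catalog, starting at
    `cursor` and wrapping around, so every older video is revisited in turn.
    """
    fast = video_ids[:fast_n]
    backlog = video_ids[fast_n:]
    if not backlog:
        return ScanPlan(fast=fast, deep=[], next_cursor=0)
    start = cursor % len(backlog)
    count = min(deep_n, len(backlog))
    deep = [backlog[(start + i) % len(backlog)] for i in range(count)]
    return ScanPlan(fast=fast, deep=deep, next_cursor=(start + count) % len(backlog))


def within_quota(used_units: int, daily_limit: int, cost: int) -> bool:
    """Return True if spending `cost` more units stays within the daily limit."""
    return used_units + cost <= daily_limit


if __name__ == "__main__":
    ids = [f"v{i}" for i in range(40)]   # v0 is the newest upload
    plan = plan_scan(ids, cursor=0)
    print(plan.deep, plan.next_cursor)
```

With an hourly cycle, the time for the deep pass to cover the whole back catalog is roughly the backlog size divided by `deep_n` hours, so the batch size would be chosen to fit the desired revisit window and the available API quota.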
Repository is private — contact guch79@gmail.com for access or commercial options.