The 'John Smith' problem: detecting podcast guest appearances without false positives

Published: (June 8, 2026 at 01:13 PM EDT)
6 min read
Source: Dev.to

I listen to podcasts because of people, not shows. When a researcher or founder I like goes on someone’s podcast, I want that one episode, but I don’t want to subscribe to all 400 episodes of every show they might ever appear on. There’s no button for that anywhere.

So I built one: GuestVine. You follow people; whenever one of them shows up as a guest on any podcast, that single episode lands in a custom RSS feed you subscribe to once, in whatever player you already use.

The fun part wasn’t the web app; it was the detection. “Did this person appear as a guest on this episode?” sounds trivial and absolutely is not. Here’s how I built it.

No new player, no re-hosting audio. The whole thing is RSS in, RSS out:

    [Podcast Index] --> [Detection Pipeline] --> [Postgres] --> [Feed Service] --> your RSS URL
                                                      ^
                                               [Control Panel] <-- [you]

The feed items we emit point at the original publisher’s audio file. You can play episodes right there (inline on the site, or in whatever podcast app you subscribe the feed into), but we never re-host the audio: every enclosure is the publisher’s own file, served from their CDN. We just decide what goes in the feed. Which means everything hinges on one question being answered correctly, at scale, with no human in the loop.

Say you follow John Smith. I pull candidate episodes from Podcast Index and now have to classify each one. The failure modes are everywhere:

- His name is in the title because he’s the guest. ✅
- His name is in the description because the host mentions him in passing. ❌
- His name is in the title of an episode about a different John Smith. ❌
- The episode has a structured tag naming him as guest. ✅

A naive substring match delivers garbage. So detection is three layers: match → score → verify.

Not all evidence is equal. I match in priority order and record which signal fired:

    export type MatchSignal =
      | "person_tag"         // <podcast:person> tag: structured, strongest
      | "title_guest"        // full name in TITLE + a guest cue ("with", "feat.")
      | "title_plain"        // full name in TITLE, no cue
      | "description_guest"  // full name in DESCRIPTION + guest cue
      | "description_plain"; // full name in DESCRIPTION, no cue (weakest)
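To make the priority ordering concrete, here is a minimal sketch of a matcher that returns the strongest signal that fires. It is not GuestVine's actual code: the `Episode` and `PersonTag` shapes, the `detectSignal` name, the ASCII-only normalization, and the "cue anywhere in the same field" rule are all assumptions made for illustration. The one rule taken from the podcast namespace is that a person tag without a role is treated as a host, not a guest.

```typescript
type MatchSignal =
  | "person_tag"
  | "title_guest"
  | "title_plain"
  | "description_guest"
  | "description_plain";

// Hypothetical normalized episode shape; field names are assumptions.
interface PersonTag {
  name: string;
  role?: string; // a missing role means "host"
}

interface Episode {
  title: string;
  description: string;
  persons: PersonTag[];
}

// Single-word cues; "with" also covers "sits down with" and "in conversation with".
const GUEST_CUES = ["with", "featuring", "feat", "ft", "joins"];

// Lowercase and collapse anything that is not a-z or 0-9 into single spaces.
// ASCII-only on purpose to keep the sketch short.
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Whole-word phrase match on already-normalized text.
function containsPhrase(haystack: string, phrase: string): boolean {
  return (" " + haystack + " ").indexOf(" " + phrase + " ") !== -1;
}

// Simplification: a cue anywhere in the same field counts as guest context.
function hasGuestCue(normalized: string): boolean {
  return GUEST_CUES.some((cue) => containsPhrase(normalized, cue));
}

// Returns the strongest signal that fires, or null when the name never appears.
function detectSignal(episode: Episode, personName: string): MatchSignal | null {
  const name = normalize(personName);
  if (name === "") return null;

  // 1. Structured tag: only an explicit "guest" role counts.
  const taggedAsGuest = episode.persons.some((p) => {
    const role = (p.role === undefined ? "host" : p.role).toLowerCase();
    return role === "guest" && normalize(p.name) === name;
  });
  if (taggedAsGuest) return "person_tag";

  // 2. Title, with or without a guest cue.
  const title = normalize(episode.title);
  if (containsPhrase(title, name)) {
    return hasGuestCue(title) ? "title_guest" : "title_plain";
  }

  // 3. Description, with or without a guest cue.
  const description = normalize(episode.description);
  if (containsPhrase(description, name)) {
    return hasGuestCue(description) ? "description_guest" : "description_plain";
  }
  return null;
}
```

Under these assumptions, an episode titled "Deep Work with John Smith" yields `title_guest`, while an episode whose only evidence is a role-less person tag for John Smith yields `null`, because that tag describes a host.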

The gold standard is the <podcast:person> tag from the podcast namespace: structured metadata where a publisher explicitly says “this person was a guest.” When it’s present, the guesswork disappears. It usually isn’t present, so I fall back to text and lean on “guest cue” words (with, featuring, ft, joins, sits down with, in conversation with) to separate a guest from a name-drop.

Each signal has a base confidence:

    const SIGNAL_SCORE: Record<MatchSignal, number> = {
      person_tag: 0.98,
      title_guest: 0.9,
      title_plain: 0.6,
      description_guest: 0.55,
      description_plain: 0.3,
    };

Then the part I’m fondest of. A name made of two extremely common tokens (“John Smith,” “Mike Jones”) is far more likely to be a coincidental match than “Lex Fridman” is. So common names pay a tax:

    function commonNamePenalty(name: string): number {
      const tokens = name.toLowerCase().split(/\s+/).filter(Boolean);
      // Guard for single-token names. The exact threshold was lost from the
      // published snippet; "< 2" is an assumption.
      if (tokens.length < 2) return 0;
      // COMMON_TOKENS is a Set of very common first and last names.
      const commonCount = tokens.filter((t) => COMMON_TOKENS.has(t)).length;
      if (commonCount >= 2) return 0.2;   // "john smith": heavy damp
      if (commonCount === 1) return 0.08; // "john fridman": light damp
      return 0;
    }

Crucially, person_tag matches are exempt from the penalty: if a publisher structurally tagged the guest, I trust it regardless of how common the name is. The penalty only applies to the fuzzy text signals, where coincidence is actually possible.

The score collapses into three tiers, and the tier decides the action:

    let tier: Tier;
    if (score >= 0.8) tier = "A";      // auto-deliver
    else if (score >= 0.4) tier = "B"; // hold for verification
    else tier = "C";                   // drop, silently

Tier  Meaning                                   Action
A     structured tag, or titular guest context  auto-deliver
B     name present but ambiguous                hold; verify before delivering
C     passing mention / low signal              drop
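Putting the base scores, the common-name penalty, and the tier thresholds together might look like the sketch below. Two things here are assumptions rather than facts from the article: the penalty is subtracted from the base score (the article only says common names are damped), and `COMMON_TOKENS` is a tiny illustrative set standing in for the real list. The `scoreMatch` and `tierFor` names are hypothetical.

```typescript
type MatchSignal =
  | "person_tag"
  | "title_guest"
  | "title_plain"
  | "description_guest"
  | "description_plain";

type Tier = "A" | "B" | "C";

// Base confidences as given in the article.
const SIGNAL_SCORE: Record<MatchSignal, number> = {
  person_tag: 0.98,
  title_guest: 0.9,
  title_plain: 0.6,
  description_guest: 0.55,
  description_plain: 0.3,
};

// Illustrative stand-in; the real list of common name tokens is not shown.
const COMMON_TOKENS = new Set<string>(["john", "smith", "mike", "jones"]);

function commonNamePenalty(name: string): number {
  const tokens = name.toLowerCase().split(/\s+/).filter(Boolean);
  if (tokens.length < 2) return 0; // assumed guard for single-token names
  const commonCount = tokens.filter((t) => COMMON_TOKENS.has(t)).length;
  if (commonCount >= 2) return 0.2;
  if (commonCount === 1) return 0.08;
  return 0;
}

// Assumption: the penalty is subtracted from the base score.
// Structured person tags skip the penalty entirely.
function scoreMatch(signal: MatchSignal, name: string): number {
  const base = SIGNAL_SCORE[signal];
  if (signal === "person_tag") return base;
  return Math.max(0, base - commonNamePenalty(name));
}

// Thresholds as given in the article.
function tierFor(score: number): Tier {
  if (score >= 0.8) return "A"; // auto-deliver
  if (score >= 0.4) return "B"; // hold for verification
  return "C"; // drop
}
```

With these numbers, a `title_guest` match for "John Smith" scores 0.9 - 0.2 = 0.7 and lands in Tier B, while the same signal for "Lex Fridman" keeps its 0.9 and auto-delivers. A `person_tag` match for "John Smith" stays at 0.98 because the exemption applies.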

The product decision baked in here: start strict. Only Tier A auto-delivers. A missed appearance is invisible: you just never knew it existed. A wrong appearance is loud and corrosive: it teaches you the feed is junk, and you unsubscribe. For a trust product, precision beats recall every time. I’d rather under-deliver and stay credible.

Tier B is the interesting middle: real signal, real ambiguity. Rather than drop it, I optionally hand it to an LLM (Claude) with the episode metadata and the person’s disambiguating context, and ask one narrow question: is this plausibly this specific person, as a guest? If it promotes the match, it ships; otherwise it stays held.

The key restraint: the LLM is a tie-breaker, not the pipeline. It never sees Tier A (no need) or Tier C (not worth the tokens). It only adjudicates the genuinely ambiguous middle band. That keeps cost bounded and keeps the deterministic scoring in charge of the easy 90%.

Unspecified role defaults to “host,” not “guest.” Per the spec, a missing role means host. Get this backwards and you deliver every host as if they were a guest: a flood of false positives from the highest-trust signal. Brutal.

Players cache RSS aggressively. “Why isn’t my new episode showing up” was almost always the player, not me. Worth knowing before you debug your own feed generator for an hour.

The whole thing is testable without the network. Match and score are pure functions over normalized episode structs, so the test suite runs against recorded fixtures: no API key, no flakiness. The detection logic above is all covered by plain Vitest unit tests, which made tuning the penalties safe.

Next.js (App Router) for the control panel, API, and RSS serving · Postgres + Prisma for people/feeds/episodes/appearances and the fan-out · passwordless auth (magic link + OTP in one email) · the detection worker above on a cron · Claude for the Tier-B verifier · Vitest for the matching/scoring/feed logic.
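The "tie-breaker, not the pipeline" rule for Tier B can be sketched as a small routing function with the verifier injected as a plain async callback. Everything named here (`decide`, `VerifyRequest`, `Verifier`) is hypothetical, and no real LLM client is called: in production the callback would wrap whatever model request is used, and in tests it can be a stub. Holding the match when the verifier throws is also an assumption, chosen to stay consistent with the precision-first stance.

```typescript
type Tier = "A" | "B" | "C";
type Decision = "deliver" | "hold" | "drop";

// Hypothetical context handed to the verifier; field names are assumptions.
interface VerifyRequest {
  personName: string;
  personContext: string; // disambiguating detail, e.g. occupation
  episodeTitle: string;
  episodeDescription: string;
}

// Answers one narrow question: is this plausibly this specific person, as a guest?
type Verifier = (req: VerifyRequest) => Promise<boolean>;

async function decide(
  tier: Tier,
  req: VerifyRequest,
  verify?: Verifier
): Promise<Decision> {
  if (tier === "A") return "deliver"; // never sent to the verifier
  if (tier === "C") return "drop"; // not worth the tokens
  if (!verify) return "hold"; // verification is optional
  try {
    return (await verify(req)) ? "deliver" : "hold";
  } catch {
    // Assumed fail-closed behavior: an unavailable verifier never promotes a match.
    return "hold";
  }
}
```

Because the verifier is only reached on the Tier B branch, the number of model calls is bounded by the size of the ambiguous band, and the routing itself can be unit-tested with a stub that counts how often it is invoked.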
That precision-first detection is the core of GuestVine. If you try it, the one piece of feedback I’d love: is getting your feed into …

I’m happy to go deeper on any layer: the namespace parsing, the scoring tuning, or how the RSS fan-out works across multiple feeds per user. Ask in the comments.
