I Built an AI That Argues Back at Your Startup Pitch in Real Time
Source: Dev.to
PitchFire Overview

Every founder has been there. You rehearse your pitch until it sounds bullet‑proof, walk into the room, and an investor asks one question that makes everything fall apart—not because the idea is bad, but because you never had anyone argue back.
That’s what I built for the Gemini Live Agent Challenge 2026: PitchFire, a real-time AI pitch‑steelmanning agent that challenges every weak claim you make, validates every strong one, and generates a battle‑hardened pitch deck from only the arguments you successfully defended.
- Live demo:
- Source code:
How It Works
- Start speaking – Tap the orb and begin your pitch.
- Voice Activity Detection – PitchFire listens for pauses. When you stop speaking, it captures the segment, sends it to Gemini 2.5 Flash, and returns a challenge card within 2–3 seconds.
Example cards
Weak claim – “Our TAM is $50 billion.”
- Red challenge card: “A $50B TAM from what source? What year? What percentage can you realistically capture in 24 months? TAM without SAM/SOM is theater.”
- Your conviction score drops.
Strong claim – “We have 3 paying pilots at $5K/month.”
- Green validation card: Your conviction score goes up.
- End of pitch – Hit END and Gemini generates a pitch deck containing only the claims that survived.
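The scoring loop described above can be pictured with a small sketch that keeps a running conviction score and collects only the claims that survived. The card shape (`verdict`, `delta`, `defended`), the starting score of 50, and the 0 to 100 clamp are assumptions made for illustration; the article does not spell out PitchFire's actual scoring rules.

```javascript
// Hypothetical card shape: { claim, verdict: "weak" | "strong", delta, defended }.
// None of these field names come from PitchFire itself.
function applyCard(state, card) {
  const change = card.verdict === "strong" ? card.delta : -card.delta;
  // Clamp to an assumed 0-100 range.
  const score = Math.min(100, Math.max(0, state.score + change));
  return { score, cards: [...state.cards, card] };
}

// Assumption: a claim "survives" if it was validated outright,
// or was challenged and then successfully defended.
function survivingClaims(state) {
  return state.cards
    .filter((c) => c.verdict === "strong" || c.defended === true)
    .map((c) => c.claim);
}

let session = { score: 50, cards: [] };
session = applyCard(session, {
  claim: "Our TAM is $50 billion.",
  verdict: "weak",
  delta: 10,
  defended: false,
});
session = applyCard(session, {
  claim: "We have 3 paying pilots at $5K/month.",
  verdict: "strong",
  delta: 15,
  defended: false,
});

console.log(session.score);            // 55
console.log(survivingClaims(session)); // only the paying-pilots claim remains
```

Keeping the state immutable (each card returns a new state object) makes it easy to replay a session or undo a card, though a mutable counter would work just as well for a single-user tool.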
Modes
- Interrupt Mode – Interrupts when inconsistencies are detected or you fall silent.
- Full Pitch Mode – Stays quiet until you have been silent for 3 seconds, then provides a full breakdown of the entire pitch.
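One way to picture the silence handling behind both modes is a small state machine fed with per-buffer RMS values. This is a sketch rather than PitchFire's code: the `SilenceSegmenter` name, the RMS threshold of 0.02, and the 800 ms pause used for Interrupt Mode are assumptions. Only the 3-second wait for Full Pitch Mode comes from the description above.

```javascript
// Root-mean-square level of one buffer of float samples in [-1, 1].
function rms(samples) {
  let sum = 0;
  for (const s of samples) sum += s * s;
  return Math.sqrt(sum / samples.length);
}

// Hypothetical segmenter: collects voiced buffers and emits a segment
// once silence has lasted at least `silenceMs`.
class SilenceSegmenter {
  constructor({ threshold = 0.02, silenceMs, bufferMs, onSegment }) {
    this.threshold = threshold;
    this.silenceMs = silenceMs;
    this.bufferMs = bufferMs; // duration of each pushed buffer
    this.onSegment = onSegment;
    this.chunks = [];
    this.silentFor = 0;
  }

  push(samples) {
    if (rms(samples) >= this.threshold) {
      this.chunks.push(samples);
      this.silentFor = 0;
      return;
    }
    if (this.chunks.length === 0) return; // still waiting for speech to start
    // Silent buffers are counted but not stored in the segment.
    this.silentFor += this.bufferMs;
    if (this.silentFor >= this.silenceMs) {
      this.onSegment(this.chunks);
      this.chunks = [];
      this.silentFor = 0;
    }
  }
}

// Assumed per-mode pause lengths; only the 3000 ms value is from the article.
const SILENCE_MS = { interrupt: 800, fullPitch: 3000 };

const segments = [];
const seg = new SilenceSegmenter({
  silenceMs: SILENCE_MS.fullPitch,
  bufferMs: 1000, // unrealistically long buffers, to keep the demo short
  onSegment: (chunks) => segments.push(chunks.length),
});
const voiced = new Float32Array(4).fill(0.5);
const silent = new Float32Array(4);
[voiced, voiced, silent, silent, silent].forEach((b) => seg.push(b));
console.log(segments); // one segment made of 2 voiced buffers
```

Switching modes then amounts to constructing the segmenter with a different `silenceMs`, which keeps the detection logic itself identical.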
Each card offers three actions:
- READ – View the full challenge.
- ▶ LISTEN – Hear it spoken aloud.
- ↩ RESPOND – Type a direct defense, which is sent back through Gemini to keep the conversation anchored to that specific claim.
The Technical Stack
The audio pipeline is the core of the system.
// Browser audio capture (simplified)
const audioContext = new (window.AudioContext || window.webkitAudioContext)();
const processor = audioContext.createScriptProcessor(4096, 1, 1);

processor.onaudioprocess = (e) => {
  const input = e.inputBuffer.getChannelData(0);
  const rms = Math.sqrt(
    input.reduce((sum, sample) => sum + sample * sample, 0) / input.length
  );
  // Detect voice activity based on RMS threshold
  // Accumulate chunks while voice is present
  // When silence exceeds threshold, concatenate chunks,
  // prepend 44‑byte WAV header, base64‑encode, and POST to Gemini
};

navigator.mediaDevices.getUserMedia({ audio: true })
  .then(stream => {
    const source = audioContext.createMediaStreamSource(stream);
    source.connect(processor);
    processor.connect(audioContext.destination);
  });

- Capture – Raw PCM16 at 16 kHz via ScriptProcessorNode.
- VAD – RMS volume per buffer determines voice activity.
- Packaging – Chunks are concatenated, wrapped in a 44‑byte WAV header, base64‑encoded, and sent to Gemini’s multimodal REST endpoint.
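The packaging step can be sketched roughly as follows: concatenate the float buffers, convert them to 16-bit PCM, prepend a standard 44-byte RIFF/WAVE header, and base64-encode the result. The helper names (`concatChunks`, `floatTo16BitPCM`, `encodeWav`, `toBase64`) are hypothetical, and the sketch assumes mono audio that is already at the target sample rate.

```javascript
// Join the accumulated Float32Array chunks into one buffer.
function concatChunks(chunks) {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// Convert float samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(float32) {
  const pcm = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Build a mono 16-bit WAV file: 44-byte header followed by the samples.
function encodeWav(pcm, sampleRate) {
  const dataSize = pcm.length * 2;
  const view = new DataView(new ArrayBuffer(44 + dataSize));
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);   // file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt chunk size
  view.setUint16(20, 1, true);              // audio format 1 = PCM
  view.setUint16(22, 1, true);              // channel count
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate = rate * channels * 2
  view.setUint16(32, 2, true);              // block align = channels * 2
  view.setUint16(34, 16, true);             // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);
  for (let i = 0; i < pcm.length; i++) view.setInt16(44 + i * 2, pcm[i], true);
  return new Uint8Array(view.buffer);
}

// Base64-encode in slices so large buffers do not overflow the argument limit.
function toBase64(bytes) {
  let binary = "";
  const step = 0x8000;
  for (let i = 0; i < bytes.length; i += step) {
    binary += String.fromCharCode(...bytes.subarray(i, i + step));
  }
  return btoa(binary);
}

const chunks = [new Float32Array([0, 0.5]), new Float32Array([-1])];
const pcm = floatTo16BitPCM(concatChunks(chunks));
const wav = encodeWav(pcm, 16000);
console.log(wav.length);                // 50 (44-byte header + 3 samples * 2 bytes)
console.log(toBase64(wav).slice(0, 4)); // "UklG", the base64 form of "RIF"
```

Note that `getChannelData` returns 32-bit floats at the `AudioContext`'s own sample rate, so producing 16 kHz PCM16 requires either requesting that rate when the context is created or resampling before this step.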
What Gemini Made Possible
The entire product hinges on a single, well‑engineered Gemini prompt. The model:
- Transcribes the audio.
- Analyzes each claim in one call.
- Classifies claims as weak or strong.
- Generates investor‑style challenges or validations, cites counter‑evidence, scores the claim, and categorizes it across six pitch dimensions.
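A single-call request of this kind might look like the sketch below, which sends the prompt and the base64 WAV as two parts of one `generateContent` request and asks for JSON back. The prompt text and the response fields (`transcript`, `verdict`, `response`, `score`, `dimension`) are placeholders invented for illustration; the article does not publish the real prompt or schema.

```javascript
const GEMINI_URL =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent";

// Hypothetical prompt; the real PitchFire prompt is not shown in the article.
const PROMPT =
  "Transcribe the audio, then return JSON with the fields " +
  "transcript, verdict (weak or strong), response, score, dimension.";

// Build the request body: one text part plus one inline audio part.
function buildRequest(wavBase64) {
  return {
    contents: [
      {
        parts: [
          { text: PROMPT },
          { inline_data: { mime_type: "audio/wav", data: wavBase64 } },
        ],
      },
    ],
    // Ask the model to reply with JSON instead of free text.
    generationConfig: { responseMimeType: "application/json" },
  };
}

// Pull the JSON card out of the first candidate's first text part.
function parseCard(apiResponse) {
  const text = apiResponse?.candidates?.[0]?.content?.parts?.[0]?.text;
  if (typeof text !== "string") throw new Error("No text part in response");
  return JSON.parse(text);
}

// Not invoked here: requires a real API key and network access.
async function analyzeSegment(wavBase64, apiKey) {
  const res = await fetch(GEMINI_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify(buildRequest(wavBase64)),
  });
  if (!res.ok) throw new Error(`Gemini request failed: ${res.status}`);
  return parseCard(await res.json());
}

// Offline check with a mocked response in the shape the API returns.
const mockResponse = {
  candidates: [
    { content: { parts: [{ text: '{"verdict":"weak","score":-10}' }] } },
  ],
};
console.log(parseCard(mockResponse).verdict); // "weak"
```

Calling the endpoint straight from the browser would expose the API key to users, so a small server-side proxy is a common choice; whether PitchFire does this is not stated.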
Without Gemini’s ability to handle multimodal input and structured reasoning simultaneously, this product wouldn’t exist. The Gemini 2.5 Flash API reduced the build time from months to a few days.
What’s Next
- Investor persona modes – VC, angel, strategic.
- Team practice mode – Multiple founders can practice together.
- Integration – Connect with popular pitch‑deck tools.
Built solo for the Gemini Live Agent Challenge 2026.