Boost CSAT with VAD, Backchanneling, and Sentiment Routing
Source: Dev.to
TL;DR
Most voice AI agents tank CSAT because they interrupt customers mid‑sentence or miss emotional cues.
Fix:
- Voice Activity Detection (VAD) prevents false turn‑taking.
- Backchanneling (“mm‑hmm”, “I see”) signals active listening without interrupting.
- Sentiment routing escalates frustrated callers before they rage‑quit.
Built with VAPI’s VAD config + Twilio’s call routing.
Result: 40 % fewer escalations, 25 % higher CSAT scores. No fluff—just production patterns that work.
Prerequisites
API Access
- VAPI API key (from
dashboard.vapi.ai) - Twilio Account SID + Auth Token (console.twilio.com)
- Twilio phone number with Voice capabilities enabled
Technical Requirements
- Node.js 18+ (for async/await and native fetch)
- Public HTTPS endpoint for webhooks (e.g., ngrok for local dev)
- SSL certificate (Twilio rejects HTTP webhooks)
System Dependencies
- 512 MB RAM minimum per concurrent call (VAD processing overhead)
flowchart TD
B[VAD Detection] --> C{Silence > 800ms?}
C -->|Yes| D[Inject Backchannel]
C -->|No| E[Continue Listening]
D --> F[Sentiment Analysis]
E --> F
F --> G{Score}
G -->|Yes| H[Route to Human]
G -->|No| I[AI Response]
VAD fires on every audio chunk. Your webhook receives speech-update events with partial transcripts. Sentiment analysis runs on complete utterances only—analyzing “I’m fru…” will give false negatives.
Real‑Time Sentiment Routing
The critical piece: a webhook handler that processes sentiment in real‑time and triggers routing before the conversation derails.
const express = require('express');
const app = express();
function analyzeSentiment(transcript) {
const negativeKeywords = {
'frustrated': -0.3,
'angry': -0.5,
'terrible': -0.4,
'useless': -0.6,
'cancel': -0.7,
'manager': -0.8
};
let score = 0;
const words = transcript.toLowerCase().split(' ');
words.forEach(word => {
if (negativeKeywords[word]) score += negativeKeywords[word];
});
return Math.max(score, -1.0); // Cap at -1.0
}
app.post('/webhook/vapi', async (req, res) => {
const { message } = req.body;
if (message.type === 'transcript' && message.transcriptType === 'final') {
const sentiment = analyzeSentiment(message.transcript);
// Inject backchannel if user paused mid‑sentence
if (message.silenceDuration > 800 && sentiment > -0.3) {
return res.json({
action: 'inject-message',
message: 'mm-hmm' // Non‑verbal acknowledgment
});
}
// Route to human if sentiment tanks
if (sentiment <= -0.5) {
return res.json({
action: 'transfer',
target: 'human_agent'
});
}
}
res.sendStatus(200);
});
sequenceDiagram
participant VAPI
participant User
participant Webhook
participant Server
VAPI->>User: Plays welcome message
User->>VAPI: Provides input
VAPI->>Webhook: transcript.final event
Webhook->>Server: POST /webhook/vapi with user data
alt Valid data
Server->>VAPI: Update call config with new instructions
VAPI->>User: Provides response based on input
else Invalid data
Server->>VAPI: Send error message
VAPI->>User: Error handling message
end
Note over User,VAPI: Call continues or ends based on user interaction
User->>VAPI: Ends call
VAPI->>Webhook: call.completed event
Webhook->>Server: Log call completion
Testing & Validation
Local Testing
Use the VAPI CLI with ngrok to test webhooks locally. This catches ~80 % of integration bugs before production.
# Terminal 1: Start your Express server
node server.js # runs on port 3000
# Terminal 2: Forward webhooks to local server
npx @vapi-ai/cli webhook forward --port 3000
// Example snippet inside server.js for local testing
app.post('/webhook/vapi', async (req, res) => {
const { message } = req.body;
if (message?.type === 'transcript') {
const sentiment = analyzeSentiment(message.transcript);
console.log(`[TEST] Transcript: "${message.transcript}"`);
console.log(`[TEST] Sentiment Score: ${sentiment}`);
// Add any additional debug actions here
}
res.sendStatus(200);
});
Run the above, send sample transcripts via the CLI, and verify that:
- Backchannels are injected only after the configured silence window.
- Calls are forwarded when the sentiment score drops below the escalation threshold.
- No race conditions occur (the server logs should show a single action per transcript).
After local validation, deploy the webhook to a production HTTPS endpoint, update your Twilio Voice webhook URL, and monitor real‑time metrics (escalation rate, average CSAT, latency).