How to Implement Context Retention in Voice AI Applications
Source: Dev.to
TL;DR
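Voice AI loses context between turns: users have to repeat themselves, and the agent forgets earlier requests. This breaks the user experience and wastes API calls. Build persistent session state using VAPI's metadata field plus in-memory storage on your server (or Redis at scale). Track conversation history, user intent, and call metadata across turns. The result, as reported by the author: the agent remembers context, latency drops by 40%, and API costs fall because redundant clarifications are eliminated.
Prerequisites
API Keys and Credentials
- VAPI API key (generated at dashboard.vapi.ai)
- Twilio Account SID and Auth Token (from console.twilio.com)
- OpenAI API key for LLM inference (gpt-3.5-turbo or gpt-4 at minimum)
System Requirements
- Node.js 18+ (async/await support is required)
- Redis 6.0+ or PostgreSQL 12+ for session persistence (in-memory storage alone loses context on restart)
- At least 2 GB of RAM for handling concurrent sessions
SDK Versions
- vapi-sdk: ^0.8.0 or later
- twilio: ^4.0.0 or later
- axios: ^1.6.0 for HTTP requests
Network Setup
- A public HTTPS endpoint for webhooks (ngrok works during development; production needs a valid SSL certificate)
- Firewall rules that allow inbound traffic on port 443
- Webhook signature verification enabled (HMAC-SHA256)
Knowledge Requirements
- Familiarity with REST APIs and JSON payloads
- An understanding of session management and state machines
- Basic knowledge of voice call flows and transcription events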
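Step-by-Step Tutorial
Configuration and Setup
Start by writing the assistant configuration. It determines how the voice agent behaves: model selection, voice provider, transcription settings, and, most importantly, how context is handled across calls.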
const assistantConfig = {
  model: {
    provider: "openai",
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "You are a customer support agent. Maintain context from previous interactions. Reference customer history when available."
      }
    ],
    temperature: 0.7
  },
  voice: {
    provider: "elevenlabs",
    voiceId: "EXAVITQu4vr4xnSDxMaL",
    speed: 1.0
  },
  transcriber: {
    provider: "deepgram",
    model: "nova-2",
    language: "en",
    endpointing: 300
  },
  firstMessageMode: "assistant-speaks",
  recordingEnabled: true
};
The messages array is where you inject context from earlier conversations. This is your state retention mechanism.
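Injected history grows with every call, so it helps to cap how much of it reaches the prompt. The sketch below is one possible way to build the messages array under a character budget. The helper names (`buildHistoryBlock`, `buildMessages`) and the 1000-character default are assumptions for illustration, not part of any VAPI SDK.

```javascript
// Hypothetical helpers: keep the most recent turns that fit in a character budget.
function buildHistoryBlock(history, maxChars = 1000) {
  const lines = [];
  let used = 0;
  // Walk backwards so the newest turns are kept first
  for (let i = history.length - 1; i >= 0; i--) {
    const line = `${history[i].role}: ${history[i].content}`;
    if (used + line.length + 1 > maxChars) break;
    lines.unshift(line); // restore chronological order
    used += line.length + 1; // +1 for the newline separator
  }
  return lines.length ? lines.join('\n') : 'No prior history.';
}

// Produces the single system message used in the assistant config
function buildMessages(basePrompt, history, maxChars) {
  const block = buildHistoryBlock(history, maxChars);
  return [
    {
      role: 'system',
      content: `${basePrompt}\n\nPrevious conversation history:\n${block}`
    }
  ];
}

const exampleMessages = buildMessages('You are a customer support agent.', [
  { role: 'user', content: 'My order has not arrived.' },
  { role: 'assistant', content: 'I have opened a ticket for you.' }
]);
console.log(exampleMessages[0].content);
```

A character budget is only a rough proxy for tokens; a tokenizer-based limit would be more precise if prompt size matters for cost.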
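Architecture and Flow
Your Express server receives webhook events from VAPI, keeps session state in memory (or in Redis in production), and injects context into the assistant's system prompt before each call.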
User Call → VAPI → Webhook (call.started) → Your Server (Load Context)
→ Update Assistant Config → VAPI Continues Call → Webhook (call.ended)
→ Your Server (Save Context) → Database
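Session state lives in a Map with TTL-based cleanup. When a call comes in, fetch the prior conversation history, inject it into the assistant config, and return it to VAPI through the /v1/calls endpoint.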
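As an illustration of a Map with TTL cleanup, here is a minimal sketch of a store that expires entries lazily when they are read. The `SessionStore` class and its injectable `now` clock are assumptions made for this example, so that expiry can be demonstrated without waiting an hour.

```javascript
// Hypothetical SessionStore: a Map wrapper that treats entries older than
// ttlMs as missing. The `now` parameter exists only to make expiry testable.
class SessionStore {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  set(callId, state) {
    this.entries.set(callId, { state, expiresAt: this.now() + this.ttlMs });
  }

  get(callId) {
    const entry = this.entries.get(callId);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(callId); // lazy expiry on read
      return undefined;
    }
    return entry.state;
  }

  delete(callId) {
    return this.entries.delete(callId);
  }
}

// Demo with a fake clock
let fakeClock = 0;
const store = new SessionStore(3600000, () => fakeClock);
store.set('call-1', { customerId: 'cust-42', transcript: [] });
const freshSession = store.get('call-1');
fakeClock = 3600000; // one hour later
const expiredSession = store.get('call-1');
console.log(freshSession, expiredSession);
```

Lazy expiry only removes entries that are read again, so a periodic sweep is still needed for sessions nobody touches.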
Step-by-Step Implementation
1. Initialize the Express server with a webhook handler
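const express = require('express');
const crypto = require('crypto');

const app = express();
app.use(express.json());

// Session storage: Map of callId -> session state
const sessions = new Map();
const SESSION_TTL = 3600000; // 1 hour

// Webhook signature validation (VAPI signs all webhooks)
function validateWebhookSignature(req) {
  const signature = req.headers['x-vapi-signature'];
  const timestamp = req.headers['x-vapi-timestamp'];
  if (!signature || !timestamp) return false;
  // Caveat: re-serializing the parsed body may not reproduce the exact bytes
  // that were signed; verifying against the raw request body is more robust.
  const body = JSON.stringify(req.body);
  const message = `${timestamp}.${body}`;
  const expected = crypto
    .createHmac('sha256', process.env.VAPI_WEBHOOK_SECRET)
    .update(message)
    .digest();
  const received = Buffer.from(signature, 'hex');
  // Constant-time comparison; timingSafeEqual throws on length mismatch
  return received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
}

app.post('/webhook/vapi', (req, res) => {
  if (!validateWebhookSignature(req)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }
  const event = req.body;
  const logError = err => console.error('Handler error:', err);
  // Async handlers get a catch so a failure does not become an unhandled rejection
  if (event.type === 'call.started') {
    handleCallStarted(event).catch(logError);
  } else if (event.type === 'call.ended') {
    handleCallEnded(event).catch(logError);
  } else if (event.type === 'message.updated') {
    handleMessageUpdate(event);
  }
  res.status(200).json({ received: true });
});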
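The validation step above can be hardened by verifying the raw request bytes and rejecting stale timestamps. The following is a sketch under assumptions: the "timestamp.body" signing scheme simply mirrors the handler above, while the function name `verifySignature`, the millisecond timestamp format, and the 5-minute tolerance are choices made here for illustration, not documented VAPI behavior.

```javascript
const crypto = require('crypto');

// Hypothetical verifier that works on the raw body string instead of re-serialized JSON.
function verifySignature({ rawBody, timestamp, signature, secret, toleranceMs = 300000, now = Date.now() }) {
  if (!rawBody || !timestamp || !signature) return false;

  // Assumption: the timestamp header carries epoch milliseconds.
  // Rejecting old timestamps limits replay of captured requests.
  const sentAt = Number(timestamp);
  if (!Number.isFinite(sentAt) || Math.abs(now - sentAt) > toleranceMs) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${rawBody}`)
    .digest();
  const received = Buffer.from(signature, 'hex');

  // timingSafeEqual throws on length mismatch, so compare lengths first
  return received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
}

// Demo: sign a payload locally, then verify it
const demoSecret = 'test-secret';
const demoBody = '{"type":"call.started"}';
const demoTimestamp = String(Date.now());
const demoSignature = crypto
  .createHmac('sha256', demoSecret)
  .update(`${demoTimestamp}.${demoBody}`)
  .digest('hex');
console.log(verifySignature({
  rawBody: demoBody,
  timestamp: demoTimestamp,
  signature: demoSignature,
  secret: demoSecret
}));
```

In Express, the raw bytes can be captured with the `verify` option of `express.json()`, for example by storing `buf.toString('utf8')` on the request object before the body is parsed.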
2. Load context when the call starts and inject it into the assistant
async function handleCallStarted(event) {
  const { callId, phoneNumber, customerId } = event;

  // Fetch prior conversation history from database
  let priorContext = '';
  if (customerId) {
    const history = await fetchCustomerHistory(customerId);
    priorContext = history
      .slice(-5) // Last 5 exchanges
      .map(msg => `${msg.role}: ${msg.content}`)
      .join('\n');
  }

  // Build enhanced system prompt with context
  const enhancedSystemPrompt = `You are a customer support agent.
Previous conversation history:
${priorContext || 'No prior history.'}
Current call: ${phoneNumber}
Customer ID: ${customerId || 'Unknown'}
Reference prior interactions. Be consistent with previous commitments.`;

  // Update assistant config with context
  const updatedConfig = {
    ...assistantConfig,
    model: {
      ...assistantConfig.model,
      messages: [
        {
          role: "system",
          content: enhancedSystemPrompt
        }
      ]
    }
  };

  // Store session state
  sessions.set(callId, {
    context: updatedConfig,
    customerId,
    createdAt: Date.now(),
    transcript: []
  });

  // Schedule cleanup
  setTimeout(() => sessions.delete(callId), SESSION_TTL);
}
3. Capture the transcript during the call and save it when the call ends
function handleMessageUpdate(event) {
  const { callId, message, role } = event;
  const session = sessions.get(callId);
  if (session) {
    session.transcript.push({
      role,
      content: message.content,
      timestamp: Date.now()
    });
  }
}

async function handleCallEnded(event) {
  const { callId, endedReason, duration } = event;
  const session = sessions.get(callId);
  if (!session) return;

  // Persist conversation to database
  if (session.customerId && session.transcript.length > 0) {
    await saveConversation({
      customerId: session.customerId,
      callId,
      transcript: session.transcript,
      duration,
      endedReason,
      timestamp: new Date()
    });
  }
  sessions.delete(callId);
}
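The handlers call `fetchCustomerHistory` and `saveConversation`, which the article does not define. The sketch below gives in-memory stand-ins so the flow can be exercised locally. The storage layout (a Map keyed by customer ID) is an assumption; a real deployment would presumably back these with Redis or PostgreSQL as listed in the prerequisites.

```javascript
// In-memory stand-ins for the two persistence helpers used by the handlers.
// Assumption: each saved record carries a `transcript` array of { role, content } items.
const conversationsByCustomer = new Map();

async function saveConversation(record) {
  const existing = conversationsByCustomer.get(record.customerId) || [];
  existing.push(record);
  conversationsByCustomer.set(record.customerId, existing);
}

async function fetchCustomerHistory(customerId) {
  const records = conversationsByCustomer.get(customerId) || [];
  // Flatten all saved transcripts into one chronological list of messages
  return records.flatMap(record => record.transcript);
}

// Demo: save one call, then read the history back
saveConversation({
  customerId: 'cust-42',
  callId: 'call-1',
  transcript: [
    { role: 'user', content: 'Where is my refund?' },
    { role: 'assistant', content: 'It was issued on Monday.' }
  ]
})
  .then(() => fetchCustomerHistory('cust-42'))
  .then(history => console.log(history.length, 'messages on record'));
```

Both functions are async even though the Map operations are synchronous, so they can be swapped for database-backed versions without changing the callers.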
Error Handling and Edge Cases
Race Conditions
The same customer may place two calls at the same time. Use a locking mechanism:
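const locks = new Map();

async function acquireLock(customerId, timeout = 5000) {
  // Poll until any existing lock for this customer is released
  while (locks.has(customerId)) {
    await new Promise(resolve => setTimeout(resolve, 100));
  }
  const token = Symbol(customerId);
  locks.set(customerId, token);
  // Auto-release after `timeout`, but only if this holder still owns the lock
  setTimeout(() => {
    if (locks.get(customerId) === token) locks.delete(customerId);
  }, timeout);
}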
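An alternative to polling is to chain work per customer on a promise, so tasks for the same customer run strictly one after another. This is a sketch of a different technique from the one above, not the article's method; the `withLock` name and the `tails` Map are introduced here for illustration. It only serializes work within a single Node.js process.

```javascript
// Hypothetical per-key mutex built on promise chaining.
const tails = new Map(); // customerId -> promise of the last queued task

function withLock(customerId, task) {
  const previous = tails.get(customerId) || Promise.resolve();
  // Start after the previous task settles, whether it succeeded or failed
  const run = previous.catch(() => {}).then(() => task());
  tails.set(customerId, run);

  // Drop the entry once the queue for this customer is idle
  const cleanup = () => {
    if (tails.get(customerId) === run) tails.delete(customerId);
  };
  run.then(cleanup, cleanup);

  return run; // resolves or rejects with the task's own result
}

// Demo: two overlapping tasks for the same customer run sequentially
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));
const demoOrder = [];
const firstTask = withLock('cust-42', async () => {
  demoOrder.push('first:start');
  await sleep(20);
  demoOrder.push('first:end');
});
const secondTask = withLock('cust-42', async () => {
  demoOrder.push('second:start');
});
Promise.all([firstTask, secondTask]).then(() => console.log(demoOrder));
```

Because the lock is released when the task's promise settles, there is no timeout to tune and no risk of releasing a lock that someone else holds. Multiple server instances would still need a shared lock, for example in Redis.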
Webhook Timeouts
VAPI requires a response within 5 seconds. Respond immediately and process the event asynchronously:
app.post('/webhook/vapi', async (req, res) => {
  res.status(202).json({ accepted: true }); // Respond immediately

  // Process async
  setImmediate(() => {
    const event = req.body;
    if (event.type === 'call.started') {
      handleCallStarted(event).catch(err => console.error('Handler error:', err));
    }
  });
});
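When events arrive faster than handlers finish, the acknowledge-first pattern can be paired with a small in-process queue so events are still handled in arrival order. The following is a minimal sketch under that assumption; `enqueue`, `drain`, and the handler table are hypothetical names, and the queue is lost if the process restarts.

```javascript
// Hypothetical in-process queue: the HTTP handler calls enqueue() and responds
// right away; drain() then runs handlers one at a time in arrival order.
const eventQueue = [];
let draining = false;

function enqueue(event, handlers) {
  eventQueue.push({ event, handlers });
  if (!draining) {
    draining = true;
    setImmediate(drain); // start processing after the current tick
  }
}

async function drain() {
  while (eventQueue.length > 0) {
    const { event, handlers } = eventQueue.shift();
    const handler = handlers[event.type];
    if (!handler) continue; // ignore event types without a handler
    try {
      await handler(event);
    } catch (err) {
      // One failing event must not stop the rest of the queue
      console.error(`Handler for ${event.type} failed:`, err.message);
    }
  }
  draining = false;
}

// Demo with stub handlers
const handledCalls = [];
const demoHandlers = {
  'call.started': async event => { handledCalls.push(event.callId); }
};
enqueue({ type: 'call.started', callId: 'call-1' }, demoHandlers);
enqueue({ type: 'call.started', callId: 'call-2' }, demoHandlers);
```

Inside the route, the call would look like `res.status(202).json({ accepted: true }); enqueue(req.body, handlers);`. For durability across restarts, a persistent queue would be needed instead.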
Memory Leaks
If the call.ended webhook fails, the session is never cleaned up. Add a periodic sweep:
setInterval(() => {
  const now = Date.now();
  for (const [callId, session] of sessions.entries()) {
    // Example cleanup condition (TTL already handled on start)
    if (now - session.createdAt > SESSION_TTL) {
      sessions.delete(callId);
    }
  }
}, 600000); // Run every 10 minutes
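The same sweep can be pulled into a standalone function so it can be checked without waiting for the interval to fire. This is a small sketch; the `sweepExpired` name and its injectable `now` argument are assumptions added for testability.

```javascript
// Hypothetical helper: removes sessions older than ttlMs and reports how many were dropped.
function sweepExpired(sessionMap, ttlMs, now = Date.now()) {
  let removed = 0;
  for (const [callId, session] of sessionMap) {
    if (now - session.createdAt > ttlMs) {
      sessionMap.delete(callId); // deleting the current entry during Map iteration is safe
      removed++;
    }
  }
  return removed;
}

// Demo with fixed timestamps: one stale session, one recent session
const demoSessions = new Map([
  ['old-call', { createdAt: 0 }],
  ['new-call', { createdAt: 9000 }]
]);
const removedCount = sweepExpired(demoSessions, 5000, 10000);
console.log(removedCount, [...demoSessions.keys()]);
```

It could then be scheduled with `setInterval(() => sweepExpired(sessions, SESSION_TTL), 600000).unref()`, where `unref()` keeps the timer from holding the Node.js process open at shutdown. Logging the returned count is a cheap way to notice when call.ended webhooks are being missed.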