How to Deploy an AI Voice Agent for Customer Support with VAPI

Published: December 4, 2025, 16:45 GMT+8

5 min read

Source: Dev.to

TL;DR

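Most voice agents break when a customer interrupts mid-sentence or when call volume spikes. This guide shows how to build a production-grade AI voice agent that handles both cases, using VAPI's native voice infrastructure and Twilio's carrier-grade telephony. The target is a response time under 500 ms, correct barge-in handling, and automatic failover when an API call times out.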

Stack

VAPI – voice AI
Twilio – telephony routing
Webhook server – business-logic integration

API Access & Authentication

VAPI API key – from dashboard.vapi.ai
Twilio Account SID and Auth Token – from console.twilio.com
A Twilio phone number with voice enabled
OpenAI API key – for access to GPT-4 models

Development Environment

Node.js 18+ or Python 3.9+
ngrok (or another tunneling tool) for webhook testing
Git for version control

Technical Requirements

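A public HTTPS endpoint for the webhook handler (production)
An SSL certificate (Let's Encrypt works)
Server memory ≥ 512 MB (1 GB recommended)
A stable network connection (≥ 10 Mbps upload for real-time audio)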

Knowledge Assumptions

Experience integrating REST APIs
Webhook event-handling patterns
Basic familiarity with voice protocols (SIP, WebRTC)
JSON configuration management

Cost Awareness

Service	Approx. Cost
VAPI (model + voice synthesis)	$0.05 – $0.10 per minute
Twilio (voice minutes)	$0.0085 per minute
Twilio phone number	$1 per month
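
To turn the per-minute figures above into a monthly budget, a rough calculation like the following can help. It is a minimal sketch: the rates are copied from the table, while the call volume, average call length, and the function name estimateMonthlyCost are illustrative assumptions, not measured values.

```javascript
// Per-minute and monthly rates copied from the cost table (USD).
const RATES = {
  vapiPerMinute: { low: 0.05, high: 0.10 },
  twilioPerMinute: 0.0085,
  twilioNumberPerMonth: 1,
};

// Returns a { low, high } monthly estimate for a given call volume.
// callsPerMonth and avgMinutesPerCall are inputs you supply (assumptions).
function estimateMonthlyCost(callsPerMonth, avgMinutesPerCall, numbers = 1) {
  const minutes = callsPerMonth * avgMinutesPerCall;
  const fixed = numbers * RATES.twilioNumberPerMonth;
  const low = minutes * (RATES.vapiPerMinute.low + RATES.twilioPerMinute) + fixed;
  const high = minutes * (RATES.vapiPerMinute.high + RATES.twilioPerMinute) + fixed;
  // Round to cents to hide floating-point noise.
  return {
    low: Math.round(low * 100) / 100,
    high: Math.round(high * 100) / 100,
  };
}

// Example: 1,000 calls per month averaging 3 minutes each, one phone number.
console.log(estimateMonthlyCost(1000, 3));
```

With those example inputs the estimate comes to roughly $176.50 to $326.50 per month. Actual invoices depend on the model and voice providers you select.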

Architecture Overview

flowchart LR
    A[Customer Calls] --> B[Twilio Number]
    B --> C[VAPI Assistant]
    C --> D[Your Webhook Server]
    D --> E[CRM/Database]
    E --> D
    D --> C
    C --> B
    B --> A

Twilio handles telephony routing; VAPI handles the voice AI. Your server bridges the two through webhooks. Keep these responsibilities separate to avoid phantom-audio issues.

Assistant Configuration

const assistantConfig = {
  name: "Support Agent",
  model: {
    provider: "openai",
    model: "gpt-4",
    temperature: 0.7,
    systemPrompt: "You are a customer support agent. Extract: customer name, issue type, account number. If caller interrupts, acknowledge immediately and adjust."
  },
  voice: {
    provider: "11labs",
    voiceId: "21m00Tcm4TlvDq8ikWAM",
    stability: 0.5,
    similarityBoost: 0.75
  },
  transcriber: {
    provider: "deepgram",
    model: "nova-2",
    language: "en",
    endpointing: 255 // ms silence before considering speech ended
  },
  recordingEnabled: true,
  serverUrl: process.env.WEBHOOK_URL,
  serverUrlSecret: process.env.WEBHOOK_SECRET
};

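Tip: endpointing = 255 ms is an aggressive setting suited to fast-paced support conversations. If you see false interruptions on mobile networks with high jitter, raise it to 400 ms.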
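
The configuration object can also be registered programmatically instead of through the dashboard. The sketch below only builds the HTTP request so it can be inspected before sending. It assumes VAPI accepts POST https://api.vapi.ai/assistant with a Bearer token; verify the endpoint and the accepted fields against the current VAPI API reference. The helper name buildAssistantRequest and the VAPI_API_KEY variable are illustrative.

```javascript
// Assumption: assistants are created via POST https://api.vapi.ai/assistant
// with "Authorization: Bearer <key>". Confirm against the VAPI API reference.
const VAPI_ASSISTANT_URL = 'https://api.vapi.ai/assistant';

// Builds the request without sending it, so it is easy to log and test.
function buildAssistantRequest(config, apiKey) {
  if (!apiKey) {
    throw new Error('Missing VAPI API key');
  }
  if (!config || !config.name) {
    throw new Error('Assistant config needs at least a name');
  }
  return {
    url: VAPI_ASSISTANT_URL,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(config),
    },
  };
}
```

On Node.js 18+, the result can be passed to the built-in fetch as `fetch(request.url, request.options)`, where `request = buildAssistantRequest(assistantConfig, process.env.VAPI_API_KEY)`. Check `res.ok` before reading the response body.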

Webhook Server (Node.js + Express)

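const express = require('express');
const crypto = require('crypto');
const app = express();

// Keep the raw request bytes so the HMAC is computed over exactly what was sent
app.use(express.json({
  verify: (req, _res, buf) => { req.rawBody = buf; }
}));

// Placeholder integrations: replace with your CRM, metrics store, and live-transcript feed
async function fetchCustomerData(accountNumber) { return { accountNumber }; }
async function logCallMetrics(metrics) { console.log('Call metrics:', metrics); }
async function updateLiveTranscript(callId, transcript) { console.log(callId, transcript); }

// Validate webhook signatures – production requirement
function validateSignature(req) {
  const signature = req.headers['x-vapi-signature'];
  const secret = process.env.WEBHOOK_SECRET;
  if (typeof signature !== 'string' || !secret || !req.rawBody) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(req.rawBody)
    .digest('hex');

  const received = Buffer.from(signature);
  const computed = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return received.length === computed.length &&
    crypto.timingSafeEqual(received, computed);
}

app.post('/webhook/vapi', async (req, res) => {
  if (!validateSignature(req)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  const { message } = req.body;
  if (!message) {
    return res.status(400).json({ error: 'Missing message' });
  }

  try {
    switch (message.type) {
      case 'function-call': {
        // Example: fetch customer data from CRM
        const { accountNumber } = message.functionCall.parameters;
        const customerData = await fetchCustomerData(accountNumber);
        res.json({ result: customerData });
        break;
      }

      case 'end-of-call-report':
        // Log call metrics
        await logCallMetrics({
          callId: message.call.id,
          // Parse as dates so ISO timestamp strings also work; result is in ms
          duration: new Date(message.call.endedAt) - new Date(message.call.startedAt),
          cost: message.call.cost,
          transcript: message.transcript
        });
        res.sendStatus(200);
        break;

      case 'speech-update':
        // Real-time transcript for live-agent handoff
        if (message.status === 'in-progress') {
          await updateLiveTranscript(message.call.id, message.transcript);
        }
        res.sendStatus(200);
        break;

      default:
        res.sendStatus(200);
    }
  } catch (error) {
    console.error('Webhook error:', error);
    res.status(500).json({ error: 'Processing failed' });
  }
});

app.listen(3000);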

Important: VAPI expects a response within 5 seconds. For slow external calls, reply immediately with res.sendStatus(202) and process the work asynchronously, sending results later via the VAPI API.
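
When the slow step is a single lookup, another option is to cap it with a time budget well under the 5-second limit and answer with a fallback if the budget runs out. The helper below is a minimal sketch of that idea using Promise.race. The name withTimeout, the 3,000 ms budget, and the fallback payload are illustrative choices, not part of the VAPI API.

```javascript
// Resolves with `fallback` if `promise` has not settled within `ms` milliseconds.
// Rejections from `promise` that arrive before the deadline still propagate.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Inside the function-call branch this could be used as `await withTimeout(fetchCustomerData(accountNumber), 3000, { error: 'lookup_timeout' })`, so the assistant receives a usable answer and can apologize or retry instead of going silent. Note that the underlying lookup keeps running after the deadline; cancel it separately if your client supports that.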

Connecting Twilio to VAPI

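In the VAPI dashboard, open your assistant's phone settings.
In the Twilio console, go to Phone Numbers → Buy Number → Configure Webhook.
Set the inbound webhook URL to the VAPI assistant's phone endpoint (shown in the VAPI dashboard).

For outbound calls, trigger them programmatically when a support ticket is created or escalated.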
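
A ticket-escalation hook for outbound calls might look like the sketch below. It assumes VAPI starts outbound calls via POST https://api.vapi.ai/call with assistantId, phoneNumberId, and customer.number in the body; treat the endpoint and field names as assumptions and confirm them in the VAPI API reference. The ticket shape and the helper name buildOutboundCallRequest are hypothetical.

```javascript
// Assumption: outbound calls are created via POST https://api.vapi.ai/call.
const VAPI_CALL_URL = 'https://api.vapi.ai/call';

// E.164 format: "+" followed by up to 15 digits, no leading zero.
const E164 = /^\+[1-9]\d{1,14}$/;

// Builds (but does not send) the request for an escalated ticket.
// `ticket` is a hypothetical shape: { id, customerPhone }.
function buildOutboundCallRequest(ticket, { apiKey, assistantId, phoneNumberId }) {
  if (!E164.test(ticket.customerPhone)) {
    throw new Error(`Ticket ${ticket.id}: phone number is not in E.164 format`);
  }
  return {
    url: VAPI_CALL_URL,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        assistantId,
        phoneNumberId,
        customer: { number: ticket.customerPhone },
      }),
    },
  };
}
```

Validating the number before calling the API avoids paying for a request that the carrier would reject anyway.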

Monitoring Metrics (first 100 calls)

Metric	Target
Interruption accuracy	> 95 %
False barge‑ins	92 %
Transcription accuracy (noisy)	> 85 %

If interruption accuracy drops below 90 %, raise endpointing to 300 ms and lower the voice stability to 0.4 so the agent cuts off faster.
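
The tuning rule above can be applied mechanically once interruption events are labeled. The sketch below assumes a hypothetical log format in which each event records whether the caller really interrupted (actual) and whether the agent treated it as a barge-in (detected). It defines interruption accuracy as detected real interruptions divided by all real interruptions, which is one reasonable reading of the metric rather than an official definition.

```javascript
// events: [{ actual: boolean, detected: boolean }, ...] (hypothetical log shape)
function summarizeInterruptions(events) {
  const actual = events.filter((e) => e.actual);
  const detected = events.filter((e) => e.detected);
  const hits = actual.filter((e) => e.detected).length;
  const falseBargeIns = detected.filter((e) => !e.actual).length;
  return {
    // Share of real interruptions the agent reacted to.
    interruptionAccuracy: actual.length ? hits / actual.length : null,
    // Share of agent cut-offs where the caller had not interrupted.
    falseBargeInRate: detected.length ? falseBargeIns / detected.length : null,
  };
}

// Applies the rule from the text: below 90 % accuracy, use 300 ms / 0.4.
function suggestTuning(summary, current = { endpointing: 255, stability: 0.5 }) {
  if (summary.interruptionAccuracy !== null && summary.interruptionAccuracy < 0.9) {
    return { endpointing: 300, stability: 0.4 };
  }
  return current;
}
```

Running this over the first 100 calls gives a concrete number to compare against the targets before changing any configuration.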

Audio Processing Pipeline

graph LR
    A[Microphone] --> B[Audio Buffer]
    B --> C[Voice Activity Detection]
    C -->|Speech Detected| D[Speech-to-Text]
    C -->|No Speech| E[Error: Silence]
    D --> F[Intent Detection]
    F --> G[Response Generation]
    G --> H[Text-to-Speech]
    H --> I[Speaker]
    D -->|Error: Unrecognized Speech| J[Error Handling]
    J --> F
    F -->|Error: No Intent| K[Fallback Response]
    K --> G

Local Testing with ngrok

# Start ngrok tunnel (run in terminal)
ngrok http 3000

Example webhook test (Node.js snippet)

// Requires Node.js 18+ (built-in fetch). Run with WEBHOOK_SECRET set to the
// same value the webhook server uses, or the signature check will fail.
const crypto = require('crypto');

const testPayload = {
  message: {
    type: "function-call",
    functionCall: {
      name: "getCustomerData",
      // Must match the parameter name the webhook handler reads
      parameters: { accountNumber: "test-123" }
    }
  }
};

// Serialize once so the signed bytes and the sent bytes are identical
const body = JSON.stringify(testPayload);

const hash = crypto
  .createHmac('sha256', process.env.WEBHOOK_SECRET)
  .update(body)
  .digest('hex');

fetch('https://your-ngrok-url.ngrok.io/webhook/vapi', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-vapi-signature': hash
  },
  body
})
  .then(res => res.json())
  .then(data => console.log('Webhook response:', data))
  .catch(err => console.error('Webhook failed:', err));

Note: Free‑tier ngrok URLs expire after 2 hours. Update the serverUrl in the VAPI dashboard after each restart to avoid 404 errors.

Signature Validation

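The validateSignature function compares the HMAC-SHA256 hash of the request payload against the x-vapi-signature header. A mismatch results in a 401 response, which blocks forged requests and unauthorized triggering of expensive API calls. A valid signature alone does not stop a captured request from being replayed; guarding against that additionally requires a signed timestamp or nonce check.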

Customer calls → Twilio → VAPI → Your webhook → CRM/Database → (loop) → VAPI → Twilio → Customer.
