An open-source, HIPAA-eligible Twilio alternative
Source: Dev.to
Note: For the most up‑to‑date setup instructions, see open-telephony-stack/README.md in the repository.
Why we built this
Last summer we were building AI voice agents for healthcare practices. We needed to:
- Make and receive calls
- Stream audio in real‑time
- Remain HIPAA‑eligible
Twilio looked like the obvious choice until we hit the paywall: $2,000/month for a Business Associate Agreement (BAA) before a single call was placed. For a startup, that price is prohibitive.
So we created our own stack:
| Component | What it does |
|---|---|
| Asterisk | Open‑source PBX (Dockerized) |
| AWS Chime SDK | SIP trunking & phone numbers |
| FastAPI shim | Bridges legacy telephony to modern WebSocket APIs |
The result is a complete, secure telephony system that can handle inbound and outbound calls:
- Receives calls via AWS Chime Voice Connector (real PSTN number)
- Terminates SIP/TLS on Asterisk (Docker)
- Bridges audio via RTP to a WebSocket connection
- Streams base64 μ‑law audio to your AI voice server
- Exposes a Twilio‑like WebSocket API (modeled after Twilio Media Streams)
You bring your own AI; the stack only handles the phone infrastructure.
Use‑case examples
| Scenario | Why this stack helps |
|---|---|
| Healthcare AI – need HIPAA compliance without Twilio’s BAA costs | No extra compliance fees; you control the data |
| Custom call handling – Twilio limits you | Full control over dialplan, media routing, etc. |
| Full stack ownership – want to own every layer | Self‑hosted, open‑source, no vendor lock‑in |
| Learning/experimenting – understand telephony internals | End‑to‑end, from PSTN to WebSocket, all in code |
Consider alternatives if:
- You only need basic voice for a side project (Twilio is easier).
- You don’t want to manage infrastructure.
- You have no special compliance requirements.
Trade‑off: Managing this stack requires time and ongoing maintenance.
Service ports
| Service | Port | Protocol | Description |
|---|---|---|---|
| Asterisk SIP | 5061 | TCP/TLS | SIP signaling with AWS Chime |
| Asterisk ARI | 8088 | HTTP | Asterisk REST Interface (localhost only) |
| Shim server | 8080 | HTTP | FastAPI server, health endpoints |
| RTP media | 10000‑10299 | UDP | Audio streams to/from Asterisk |
Architecture overview
1. AWS Chime Voice Connector
- PSTN gateway. You provision a phone number here.
- Calls arrive as SIP/TLS on port 5061.
2. Asterisk PBX (Docker)
- Handles SIP signaling, RTP media, call routing.
- Uses ARI (Asterisk REST Interface) instead of traditional dialplan scripting.
3. Shim server (FastAPI)
| Function | Details |
|---|---|
| Connect to Asterisk via ARI WebSocket | Receives StasisStart events |
| Create ExternalMedia channels | Bridges RTP to your AI voice server |
| Maintain 20 ms RTP cadence | Keeps RTP timing steady despite WebSocket jitter |
| Forward audio | Base64 μ‑law payloads to downstream voice server |
4. Your AI voice server
- Any server that can speak the Twilio‑compatible WebSocket media format (e.g., OpenAI Realtime, AWS Nova Sonic, custom ASR/TTS).
- Sample implementation: open-telephony-stack/src/servers/voice_agent_server.py.
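To make the shim's RTP side more concrete, here is a minimal sketch of how one 20 ms μ‑law frame could be wrapped in an RTP packet. It assumes the standard 12-byte RTP header (RFC 3550) and the static payload type 0 for PCMU (RFC 3551). The function names are illustrative and are not taken from the repository.

```python
import struct

PCMU_PAYLOAD_TYPE = 0    # static RTP payload type for G.711 mu-law (RFC 3551)
SAMPLES_PER_FRAME = 160  # 20 ms of audio at 8000 Hz


def build_rtp_packet(seq: int, timestamp: int, ssrc: int, payload: bytes) -> bytes:
    """Prefix a mu-law frame with a 12-byte RTP header (hypothetical helper)."""
    first_byte = 0x80                        # version 2, no padding/extension/CSRC
    second_byte = PCMU_PAYLOAD_TYPE & 0x7F   # marker bit cleared
    header = struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


def next_counters(seq: int, timestamp: int) -> tuple:
    """Advance sequence number by 1 and timestamp by one frame, with wraparound."""
    return (seq + 1) & 0xFFFF, (timestamp + SAMPLES_PER_FRAME) & 0xFFFFFFFF
```

The timestamp advances by 160 per packet because RTP timestamps for PCMU count samples, not milliseconds.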
DNS & TLS setup
DNS record (required before TLS)
| Record type | Name | Value | TTL |
|---|---|---|---|
| A | sip.yourdomain.com | Your Elastic IP (e.g., 54.123.45.67) | 300 (or default) |
Create this A record before:
- Requesting Let’s Encrypt certificates (Certbot validates domain ownership)
- Configuring AWS Chime Voice Connector termination (Chime must resolve the hostname)
- Setting external_signaling_address in pjsip.conf (must match the DNS name)
After adding the record, wait for propagation (a few minutes to 48 hours). Verify with:
dig sip.yourdomain.com
# or
nslookup sip.yourdomain.com
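If you would rather script the check, for example in a deploy step, the sketch below does the same lookup from Python. The helper names are hypothetical, and the hostname and IP in the usage note are the placeholders used in this article.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def points_at(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> bool:
    """True if hostname currently resolves to expected_ip (hypothetical helper)."""
    try:
        return expected_ip in resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are OSError subclasses:
        # the name does not resolve (yet).
        return False
```

Usage would look like `points_at("sip.yourdomain.com", "54.123.45.67")`. This only reflects what your local resolver sees, so a positive result does not guarantee that propagation has finished everywhere.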
TLS with Let’s Encrypt
- Certbot runs on the EC2 instance, bound to port 80.
- Certificates are issued for sip.yourdomain.com.
- Asterisk reads the certs from /etc/letsencrypt/live/... via a Docker volume mount.
- A renewal hook reloads Asterisk when certificates rotate.
- Chime validates the cert against Let’s Encrypt’s CA root – no self‑signed certs, no manual renewal, no surprise expirations.
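Automatic renewal is easier to trust with a monitoring check behind it. The following is a small sketch, not part of the repository, that converts a certificate's notAfter string into days remaining. It assumes you obtain that string yourself, for example from `openssl x509 -enddate -noout`, and the function name is hypothetical.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate expires.

    not_after uses the OpenSSL text form, e.g. "Jan  5 09:34:43 2018 GMT".
    now is a Unix timestamp; it defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400.0
```

A negative result means the certificate has already expired. A cron job could alert when the value drops below the renewal window you expect.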
Call flow (what happens when someone dials your number)
1. Caller dials your AWS Chime phone number.
2. Chime sends a SIP INVITE to your Asterisk server (TLS:5061).
3. Asterisk matches the call in extensions.conf, which runs Answer() and then Stasis(voice-agent).
4. ARI sends a StasisStart event to the shim server via WebSocket.
5. Shim server performs the following steps:
   a. Opens a WebSocket to your voice server.
   b. Creates an ARI mixing bridge.
   c. Adds the PSTN channel to the bridge.
   d. Allocates a UDP port for RTP (10000‑10299; each live call gets its own port).
   e. Creates an ExternalMedia channel pointing to that port.
   f. Adds the ExternalMedia channel to the bridge.
6. Audio flow:
   PSTN ↔ Bridge ↔ ExternalMedia ↔ Shim (RTP) ↔ Voice Server (WSS)
7. Call termination (caller hangs up or AI ends the call via an ARI tool call):
   - ARI sends ChannelHangupRequest / ChannelDestroyed.
   - Shim cleans up: closes WebSocket, deletes bridge, releases the RTP port.
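The bridge and ExternalMedia steps map onto a handful of ARI REST calls. The sketch below only builds the request descriptions (method, path relative to the ARI base URL, query parameters) so the sequence is easy to read; it performs no HTTP. The endpoint and parameter names follow my reading of the ARI reference and should be checked against your Asterisk version. The helper names and the idea of pre-assigning IDs are assumptions, not the repository's code.

```python
from typing import Dict, List, NamedTuple


class AriRequest(NamedTuple):
    method: str
    path: str               # relative to the ARI base URL
    params: Dict[str, str]  # sent as query parameters


def plan_call_setup(
    channel_id: str,           # PSTN channel from the StasisStart event
    bridge_id: str,            # ID we choose for the new bridge
    external_channel_id: str,  # ID we choose for the ExternalMedia channel
    media_host: str,
    rtp_port: int,
    app: str = "voice-agent",
) -> List[AriRequest]:
    """ARI requests for the bridge and ExternalMedia steps, in order."""
    return [
        AriRequest("POST", "/bridges", {"type": "mixing", "bridgeId": bridge_id}),
        AriRequest("POST", f"/bridges/{bridge_id}/addChannel", {"channel": channel_id}),
        AriRequest(
            "POST",
            "/channels/externalMedia",
            {
                "app": app,
                "channelId": external_channel_id,
                "external_host": f"{media_host}:{rtp_port}",
                "format": "ulaw",
            },
        ),
        AriRequest(
            "POST", f"/bridges/{bridge_id}/addChannel", {"channel": external_channel_id}
        ),
    ]


def plan_call_teardown(bridge_id: str, external_channel_id: str) -> List[AriRequest]:
    """ARI requests for cleanup after the caller's channel is gone."""
    return [
        AriRequest("DELETE", f"/channels/{external_channel_id}", {}),
        AriRequest("DELETE", f"/bridges/{bridge_id}", {}),
    ]
```

Choosing the bridge and channel IDs up front avoids having to parse each response before issuing the next request, which keeps the setup path short while the caller is waiting.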
Configuration files
All config files live under deployment/asterisk-server/asterisk-config/. The Docker container mounts this directory.
pjsip.conf – SIP trunk configuration
This is the most important file. It defines the SIP trunk to AWS Chime, including transport settings, TLS certificates, inbound/outbound endpoints, and the external_signaling_address that must match the DNS name you created.
(The rest of the repository contains additional config files, Docker Compose files, and example scripts.)
AWS Chime SDK + Asterisk Shim – Quick‑Start Guide
Overview
| File | Purpose |
|---|---|
| pjsip.conf | SIP transport, TLS settings, and Chime Voice Connector host. |
| extensions.conf | Minimal dialplan – routes calls into the ARI Stasis application (voice‑agent). |
| ari.conf | Credentials for the Asterisk REST Interface (ARI). |
| http.conf | Built‑in HTTP server for ARI (bound to localhost for security). |
| rtp.conf | UDP port range for RTP media streams (default 10000‑10299). |
| modules.conf | Loads only the needed modules: PJSIP, ARI, and the μ‑law codec. |
Notes
- external_signaling_address must match your DNS name and the TLS certificate.
- local_net tells Asterisk what is “inside” vs. “outside” for NAT handling.
- verify_server=no because Chime does not present a client certificate.
- The cert/key files are what Asterisk presents to Chime during the TLS handshake.
Prerequisites
- AWS account
- EC2 instance (recommended t3.medium or larger, Amazon Linux 2023)
- Elastic IP attached to the EC2 VM (Chime Voice Connectors require a static IP)
- Domain name with an A record pointing to the Elastic IP
- Docker & Docker Compose installed on the instance
Configure a Chime Voice Connector
- Open the AWS Chime SDK console.
- Create a Voice Connector (or edit an existing one).
| Setting | Value |
|---|---|
| Host | sip.yourdomain.com |
| Port | 5061 |
| Protocol | TLS |
- Note the Voice Connector hostname – you’ll need it for pjsip.conf.
Obtain a TLS Certificate (Let’s Encrypt)
# Install certbot
sudo yum install -y certbot
# Request a certificate (port 80 must be open)
sudo certbot certonly --standalone \
--preferred-challenges http \
-d sip.yourdomain.com \
--agree-tos -m your@email.com
# Enable automatic renewal
sudo systemctl enable --now certbot-renew.timer
# Create a renewal hook that reloads Asterisk inside Docker
sudo mkdir -p /etc/letsencrypt/renewal-hooks/deploy
sudo tee /etc/letsencrypt/renewal-hooks/deploy/reload-asterisk.sh > /dev/null
The repository also ships a Lambda function that automatically updates the security group whenever AWS publishes new IP ranges for the services AMAZON, EC2, and CHIME_VOICECONNECTOR.
Deploy the Asterisk Server (Docker)
# Change to the Docker deployment directory
cd deployment/asterisk-server
# -------------------------------------------------
# Edit the configuration files as needed:
# - pjsip.conf : domain, cert paths, voice‑connector host
# - ari.conf : secure ARI username/password
# - rtp.conf : adjust port range if required
# -------------------------------------------------
# Start the Asterisk container
docker-compose up -d
# Follow the logs
docker logs -f asterisk-server
# Open an interactive Asterisk CLI
docker exec -it asterisk-server asterisk -rvvvvv
Prepare the Shim Server
Create an .env file
cat > .env <<'EOF'
ARI_BASE=http://127.0.0.1:8088/ari
ARI_USER=ariuser
ARI_PASS=your-secure-password-here
ARI_APP=voice-agent
EXTERNAL_MEDIA_HOST=127.0.0.1
ECS_MEDIA_WSS_URL=wss://your-voice-server.internal/voice/voice
RTP_PORT_START=10000
RTP_PORT_END=10299
EOF
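RTP_PORT_START and RTP_PORT_END bound the UDP ports the shim can hand out, one per live call. A minimal sketch of such a pool is shown below. It treats the range as inclusive (300 ports with the values above), which is an assumption, and the class and function names are hypothetical rather than taken from the shim's source.

```python
import os
from typing import List, Mapping, Set


class RtpPortPool:
    """Hands out one UDP port per live call from an inclusive range."""

    def __init__(self, start: int, end: int) -> None:
        if start > end:
            raise ValueError("start must not exceed end")
        # Stored in descending order so pop() returns the lowest free port first.
        self._free: List[int] = list(range(end, start - 1, -1))
        self._in_use: Set[int] = set()

    def acquire(self) -> int:
        if not self._free:
            raise RuntimeError("no free RTP ports; too many concurrent calls")
        port = self._free.pop()
        self._in_use.add(port)
        return port

    def release(self, port: int) -> None:
        # Ignore ports that are not checked out so cleanup can run twice safely.
        if port in self._in_use:
            self._in_use.remove(port)
            self._free.append(port)

    @property
    def available(self) -> int:
        return len(self._free)


def pool_from_env(env: Mapping[str, str] = os.environ) -> RtpPortPool:
    """Build a pool from RTP_PORT_START / RTP_PORT_END, defaulting to 10000-10299."""
    start = int(env.get("RTP_PORT_START", "10000"))
    end = int(env.get("RTP_PORT_END", "10299"))
    return RtpPortPool(start, end)
```

Under this assumption, the size of the range is also the ceiling on concurrent calls, so widen it in both rtp.conf and .env if you expect more than 300.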
Build & Run the Shim
# Build the shim Docker image
docker build -t asterisk-shim -f deployment/shim-server/Dockerfile .
# Run the shim (host network so it can bind to the RTP ports)
docker run -d --env-file .env --network host --name asterisk-shim asterisk-shim
# Verify the shim health endpoint
curl http://localhost:8080/health
Test the End‑to‑End Flow
1. Call your AWS Chime phone number (the number assigned to the Voice Connector).
2. Watch the logs:
# Asterisk logs (SIP/RTP activity)
docker logs -f asterisk-server
# Shim server logs (session lifecycle)
docker logs -f asterisk-shim
You should see entries similar to:
INVITE received
CallSession created
ExternalMedia channel established
WebSocket API (Shim ↔ Voice Server)
The API mirrors Twilio Media Streams – same event structure and μ‑law audio format.
Audio Format
| Property | Value |
|---|---|
| Codec | μ‑law (PCMU) |
| Sample rate | 8000 Hz |
| Frame size | 160 bytes (20 ms) |
| Encoding | Base64 |
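The 160-byte frame size follows from the other two values: 8000 samples per second × 0.020 s = 160 samples, and μ‑law uses one byte per sample. The sketch below shows one way to send such frames on a steady 20 ms schedule, filling gaps with μ‑law silence (0xFF) when no audio is queued. Sending silence on underrun is my assumption about a reasonable policy, not a description of what the shim does, and all names are illustrative.

```python
import time
from collections import deque
from typing import Callable, Deque

SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
FRAME_BYTES = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 bytes, one per sample
SILENCE_FRAME = b"\xff" * FRAME_BYTES             # 0xFF is mu-law digital silence


def pace_frames(
    queue: Deque[bytes],
    send: Callable[[bytes], None],
    frame_count: int,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Send frame_count frames, one per 20 ms slot.

    Deadlines are computed from the start time rather than from the previous
    send, so small delays do not accumulate into drift.
    """
    start = clock()
    for index in range(frame_count):
        deadline = start + index * FRAME_MS / 1000.0
        delay = deadline - clock()
        if delay > 0:
            sleep(delay)
        send(queue.popleft() if queue else SILENCE_FRAME)
```

The clock and sleep parameters exist only so the loop can be exercised without real waiting; in normal use the defaults apply.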
Event Payloads
start (shim → voice server)
{
"event": "start",
"start": {
"streamSid": "unique-stream-id",
"callSid": "asterisk-channel-id",
"customParameters": {
"source": "asterisk-shim",
"format": "ulaw"
}
}
}
media (bidirectional)
{
"event": "media",
"streamSid": "unique-stream-id",
"media": {
"payload": "base64-encoded-ulaw-audio",
"timestamp": 1234
}
}
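As a rough illustration of the media event, the sketch below splits a μ‑law buffer into 160-byte frames and wraps each one in the JSON shape shown above. The helper names are hypothetical. Treating timestamp as milliseconds since the stream started, and padding a short final frame with silence, are both assumptions.

```python
import base64
import json
from typing import List

FRAME_BYTES = 160  # 20 ms of 8 kHz mu-law audio


def split_frames(audio: bytes, frame_bytes: int = FRAME_BYTES) -> List[bytes]:
    """Cut audio into fixed-size frames, padding the last with mu-law silence."""
    frames = []
    for offset in range(0, len(audio), frame_bytes):
        frame = audio[offset:offset + frame_bytes]
        frames.append(frame.ljust(frame_bytes, b"\xff"))
    return frames


def encode_media_event(stream_sid: str, frame: bytes, timestamp_ms: int) -> str:
    """Serialize one frame as a media event."""
    return json.dumps(
        {
            "event": "media",
            "streamSid": stream_sid,
            "media": {
                "payload": base64.b64encode(frame).decode("ascii"),
                "timestamp": timestamp_ms,
            },
        }
    )


def decode_media_event(message: str) -> bytes:
    """Extract the raw mu-law bytes from a media event."""
    event = json.loads(message)
    if event.get("event") != "media":
        raise ValueError("not a media event")
    return base64.b64decode(event["media"]["payload"])
```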
clear (voice server → shim)
{ "event": "clear" }
mark (bidirectional)
{
"event": "mark",
"streamSid": "unique-stream-id",
"mark": { "name": "responsePart" }
}
stop (either direction)
{
"event": "stop",
"streamSid": "unique-stream-id"
}
Sample Implementation
A minimal example, voice_agent_server.py, is included in the repository. It demonstrates:
- Handling the WebSocket events above
- Real‑time audio processing
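For orientation, here is a minimal sketch of how a voice server might dispatch the events it receives from the shim. It is not the code from voice_agent_server.py. The state class and function are hypothetical, and a real server would pass the decoded audio to its ASR or model instead of buffering it.

```python
import base64
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamState:
    """Per-call state kept by the voice server (illustrative only)."""
    stream_sid: Optional[str] = None
    call_sid: Optional[str] = None
    audio: bytearray = field(default_factory=bytearray)
    marks: List[str] = field(default_factory=list)
    stopped: bool = False


def handle_event(state: StreamState, message: str) -> None:
    """Apply one JSON message from the shim to the call state."""
    event = json.loads(message)
    kind = event.get("event")
    if kind == "start":
        state.stream_sid = event["start"]["streamSid"]
        state.call_sid = event["start"]["callSid"]
    elif kind == "media":
        # Payload is base64-encoded mu-law audio, 8 kHz, one byte per sample.
        state.audio.extend(base64.b64decode(event["media"]["payload"]))
    elif kind == "mark":
        state.marks.append(event["mark"]["name"])
    elif kind == "stop":
        state.stopped = True
    # Unknown event types are ignored so newer shim versions do not break us.
```

Keeping the handler a plain function over a state object makes it easy to test without opening a WebSocket.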