We Open-Sourced Our AI Calling Framework (So You Don't Waste 2-3 Months)

Published: (January 17, 2026 at 05:32 AM EST)
3 min read
Source: Dev.to

Source: Dev.to

Siphon

Three months.
That’s how long many teams spend building telephony infrastructure before writing a single line of actual conversation logic for an AI voice agent.

Not because the AI was hard.
Because telephony is brutal.

Today, we’re open‑sourcing the solution so you don’t have to go through the same pain.

The Hidden Problem with AI Calling Agents

Building an AI calling agent sounds straightforward:

  • Use an LLM
  • Add speech‑to‑text
  • Add text‑to‑speech
  • Connect it to a phone number

In reality, that’s where most teams hit a wall. To make real phone calls, you end up dealing with:

  • SIP trunks & PSTN providers
  • Low‑latency, bidirectional audio
  • Real‑time orchestration of STT, LLM, and TTS
  • Call state, interruptions, transfers
  • Scaling, monitoring, recordings, persistence

The result? Most teams spend weeks or months on infrastructure before they ever touch the conversation itself.

We did too. And eventually asked:

“Why is building voice AI still this hard?”

Introducing Siphon

Siphon is an open‑source Python framework that handles the telephony complexity for you, so you can focus on building great conversations.

Here’s what a complete AI receptionist looks like with Siphon:

from siphon.agent import Agent
from siphon.plugins import openai, cartesia, deepgram

agent = Agent(
    agent_name="receptionist",
    llm=openai.LLM(model="gpt-4"),
    tts=cartesia.TTS(voice="helpful-assistant"),
    stt=deepgram.STT(model="nova-2"),
    system_instructions="""
    You are a friendly receptionist for Acme Corp.
    Help callers schedule appointments or route them correctly.
    """
)

if __name__ == "__main__":
    agent.start()

Run this, and your agent can answer real phone calls via any SIP provider (Twilio, Telnyx, etc.).

What Siphon Handles for You

  • 🔌 SIP & PSTN connectivity – Works with any SIP provider, no FreeSWITCH pain.
  • Real‑time audio pipeline – Built on LiveKit with streaming audio and sub‑500 ms voice‑to‑voice latency.
  • 🤖 AI orchestration – Plug‑and‑play support for LLMs, STT, and TTS.

Swap providers with a single line:

llm=anthropic.LLM(model="claude-3-5-sonnet")
  • 📈 Production‑ready by default – Auto‑scaling, call recordings, transcripts, state handling, and observability.

Quick Start

Install the package:

pip install siphon-ai

Create an agent:

from siphon.agent import Agent
from siphon.plugins import openai, cartesia, deepgram

agent = Agent(
    agent_name="my_first_agent",
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    stt=deepgram.STT(),
    system_instructions="You are a helpful assistant.",
)

agent.start()

That’s it. Your agent is live and answering phone calls. (Full setup, outbound calling, and advanced examples are in the docs.)

Why We Open‑Sourced It

We could have kept Siphon proprietary or turned it into a closed SaaS, but we believe voice AI shouldn’t be locked behind massive infrastructure effort.

Siphon is:

  • Apache 2.0 licensed
  • Provider‑agnostic
  • Fully self‑hostable
  • No vendor lock‑in

Use it commercially, modify it, or build on top of it.

What You Can Build

  • 📞 Customer support agents
  • 📅 Appointment scheduling
  • 💼 Sales qualification
  • 📊 Surveys & feedback collection
  • 🏥 Healthcare intake systems

If it involves phone calls and conversations, Siphon handles the hard parts.

Get Involved

  • ⭐ GitHub:
  • 📖 Docs:
  • 🐛 Issues & feature requests welcome
  • 🤝 PRs encouraged

We’re building Siphon in public and would love community feedback. If you’ve ever thought “I wish building AI calling agents was simpler”—give Siphon a try.

Back to Blog

Related posts

Read more »