How We Built MCP Support in Bifrost (And What We Learned About Agent Security)
Source: Dev.to
When we started building MCP support for Bifrost, I thought it would be straightforward: connect to MCP servers, proxy tool calls, done.
Turns out, making this production‑ready meant solving problems that the MCP spec doesn’t even address.
Bifrost
The fastest way to build AI applications that never go down
Bifrost is a high‑performance AI gateway that unifies access to 15+ providers (OpenAI, Anthropic, AWS Bedrock, Google Vertex, and more) through a single OpenAI‑compatible API. Deploy in seconds with zero configuration and get automatic failover, load balancing, semantic caching, and enterprise‑grade features.
Quick Start
Go from zero to a production‑ready AI gateway in under a minute.
Step 1 – Start Bifrost Gateway
```bash
# Install and run locally
npx -y @maximhq/bifrost

# Or use Docker
docker run -p 8080:8080 maximhq/bifrost
```
Step 2 – Configure via Web UI
```bash
# Open the built-in web interface
open http://localhost:8080
```
Step 3 – Make your first API call
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello, Bifrost!"}]
  }'
```
That’s it! Your AI gateway is running with a web interface for visual configuration, real‑time monitoring, …
The Problem with AI Agents
AI agents hit a hard limit when you try to connect them to real infrastructure.
You want your agent to query a database, read files, or call an API? You end up writing custom integration code for every single use case.
Model Context Protocol (MCP) changes this. It’s a standard way for AI models to discover and use external tools at runtime. Instead of hard‑coding integrations, you spin up MCP servers that expose tools — and the AI model just figures out what’s available and uses it.
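Under the hood, that runtime discovery is a JSON-RPC 2.0 exchange: the client sends a `tools/list` request and the server replies with the tools it exposes (method name per the MCP specification; the struct here is a minimal sketch, not Bifrost's internal types):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is the JSON-RPC 2.0 envelope that MCP messages travel in.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
}

// toolsListRequest builds the discovery message a client sends to ask a
// server which tools it exposes ("tools/list" in the MCP specification).
func toolsListRequest(id int) (string, error) {
	b, err := json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"})
	return string(b), err
}

func main() {
	msg, _ := toolsListRequest(1)
	fmt.Println(msg)
}
```

The server's response lists each tool with a name, description, and JSON schema for its arguments, which is what lets the model "figure out what's available."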
Three big problems we observed in production deployments
- Security disasters waiting to happen – Most implementations let the AI model execute any tool it wants. No oversight, no logging. One bad prompt can update production databases or delete files.
- Zero observability – When something breaks, you have no idea which tool was called, what parameters were used, or what failed. Debugging becomes archaeology.
- Operational complexity – Managing connections to dozens of MCP servers, handling failures, controlling access – none of this is solved by the protocol itself.
Why Bifrost includes MCP support
We built MCP support directly into Bifrost. It’s not just a client – it’s a complete control plane for production‑grade agent infrastructure.
We support four ways to connect to MCP servers, each solving different problems:
| Connection type | Description | Typical latency | Ideal use‑case |
|---|---|---|---|
| In‑process | Run tools directly in Bifrost’s memory. Register typed handlers in Go. | ~0.1 ms (no network) | Internal business logic that doesn’t need a separate process. |
| Local process | Launch external processes and talk via stdin/stdout. Works well for Python/Node.js MCP servers. | 1‑10 ms | Filesystem operations, scripts, or any local tooling. |
| Remote HTTP | Talk to remote MCP servers over HTTP. | 10‑500 ms (network dependent) | Scalable micro‑services, database tools, authenticated APIs. |
| Server‑Sent Events (SSE) | Persistent connection that streams real‑time updates. | Depends on source | Monitoring tools, live dashboards, event‑driven data. |
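The local-process row boils down to spawning the server and speaking newline-delimited JSON-RPC over its pipes. A minimal sketch in Go (using `cat` as a stand-in echo process — a real deployment would launch an actual MCP server binary, and would also handle process shutdown):

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os/exec"
	"strings"
)

// startStdioServer launches an MCP server as a child process and wires up
// its stdin/stdout for newline-delimited JSON-RPC messages.
// (Cleanup of the child process is omitted for brevity.)
func startStdioServer(command string, args ...string) (io.WriteCloser, *bufio.Reader, error) {
	cmd := exec.Command(command, args...)
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return nil, nil, err
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return nil, nil, err
	}
	if err := cmd.Start(); err != nil {
		return nil, nil, err
	}
	return stdin, bufio.NewReader(stdout), nil
}

// sendLine writes one framed message; readLine reads one back.
func sendLine(w io.Writer, msg string) error {
	_, err := fmt.Fprintln(w, msg)
	return err
}

func readLine(r *bufio.Reader) (string, error) {
	line, err := r.ReadString('\n')
	return strings.TrimSpace(line), err
}

func main() {
	// `cat` echoes its input back, standing in for a real MCP server.
	in, out, err := startStdioServer("cat")
	if err != nil {
		panic(err)
	}
	sendLine(in, `{"jsonrpc":"2.0","id":1,"method":"tools/list"}`)
	reply, _ := readLine(out)
	fmt.Println(reply)
}
```

Because the pipes stay open for the life of the child process, each request costs only a write and a read — which is why the stdio latency in the table is in the low milliseconds.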
Example: In‑process tool registration (Go)
```go
type CalculatorArgs struct {
	Operation string  `json:"operation"`
	A         float64 `json:"a"`
	B         float64 `json:"b"`
}

func calculatorHandler(args CalculatorArgs) (string, error) {
	switch args.Operation {
	case "add":
		return fmt.Sprintf("%.2f", args.A+args.B), nil
	case "multiply":
		return fmt.Sprintf("%.2f", args.A*args.B), nil
	default:
		return "", fmt.Errorf("unsupported operation %q", args.Operation)
	}
}

// Register the tool with Bifrost (client and schema are set up elsewhere)
client.RegisterMCPTool(
	"calculator",
	"Perform arithmetic",
	calculatorHandler,
	schema,
)
```
Compile‑time type checking catches bugs before runtime.
Example: Auto‑execution whitelist (Go configuration)

```go
// Bifrost configuration snippet – only these tools run without approval
ToolManagerConfig: &schemas.MCPToolManagerConfig{
	ToolsToAutoExecute: []string{
		"filesystem/read_file",
		"database/query_readonly",
	},
	ToolExecutionTimeout: 30 * time.Second,
}
```
Only the tools you explicitly approve run automatically. Everything else requires manual approval, preventing disasters while keeping agents fast for safe operations.
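The gate this implies is simple: exact tool names in the allowlist run automatically, everything else is routed to a human. A hypothetical sketch (function and return values are illustrative, not Bifrost's internal API):

```go
package main

import "fmt"

// autoExecute decides how a requested tool call is handled: exact names in
// the allowlist run automatically; everything else waits for approval.
// Illustrative sketch only, not Bifrost's actual implementation.
func autoExecute(allow map[string]bool, tool string) string {
	if allow[tool] {
		return "auto"
	}
	return "needs-approval"
}

func main() {
	allow := map[string]bool{
		"filesystem/read_file":    true,
		"database/query_readonly": true,
	}
	fmt.Println(autoExecute(allow, "database/query_readonly"))
	fmt.Println(autoExecute(allow, "database/drop_table"))
}
```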
Fine‑grained tool exposure
You can filter which tools are available per request:
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-mcp-include-clients: secure-database,audit-logger" \
  -H "x-bf-mcp-include-tools: secure-database/*,audit-logger/log_access" \
  -d '{"model": "gpt-4o-mini", "messages": [...]}'
```
- The agent only sees the tools you specify.
- Customer‑facing chatbots don’t get access to admin tools.
- Financial applications can restrict agents to read‑only operations.
- Wildcards are supported (`database/*` includes all database tools).
This solves the privilege‑escalation problem by giving you per‑request, per‑tool control.
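The filter semantics are easy to picture in code. A sketch of the trailing-wildcard matching the headers describe (this mirrors the behavior, not Bifrost's actual matcher):

```go
package main

import (
	"fmt"
	"strings"
)

// matchTool checks a tool name such as "database/query" against a filter
// entry: either an exact name or a trailing wildcard like "database/*".
func matchTool(pattern, tool string) bool {
	if strings.HasSuffix(pattern, "/*") {
		return strings.HasPrefix(tool, strings.TrimSuffix(pattern, "*"))
	}
	return pattern == tool
}

// filterTools keeps only the tools an agent is allowed to see.
func filterTools(patterns, tools []string) []string {
	var visible []string
	for _, t := range tools {
		for _, p := range patterns {
			if matchTool(p, t) {
				visible = append(visible, t)
				break
			}
		}
	}
	return visible
}

func main() {
	patterns := []string{"secure-database/*", "audit-logger/log_access"}
	tools := []string{"secure-database/query", "audit-logger/log_access", "admin/drop_all"}
	fmt.Println(filterTools(patterns, tools))
}
```

An unmatched tool never reaches the model's context, so a customer-facing agent cannot even attempt to call `admin/drop_all`.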
Security model
By default, Bifrost treats tool calls as suggestions, not commands. When an AI model wants to use a tool, it returns the request to your application, and you decide whether to execute it. This gives you human oversight for anything dangerous.
If you need fully automated execution, enable agent mode and whitelist the safe tools (as shown in the configuration example above).
Bottom line
- Observability – Every tool call is logged and can be traced.
- Security – Human‑in‑the‑loop by default; optional whitelisting for safe automation.
- Operational simplicity – One gateway, four connection modes, unified configuration.
Bifrost + MCP = production‑ready AI agents that are fast, observable, and secure.
Execution Patterns
We built two execution patterns because different use cases need different approaches:
1. Agent mode
- The AI calls one tool at a time.
- It sees the result, thinks, then calls the next tool.
- Ideal for interactive agents where you want visibility at each step.
2. Code mode
- The AI writes TypeScript that orchestrates multiple tools in a single script:

```typescript
const files = await listFiles("/project");
const results = await Promise.all(
  files.map(file => analyzeFile(file))
);
return summarize(results);
```

- The code runs atomically – much faster for batch operations.
- Lower latency because you avoid multiple LLM round‑trips.
- Trade‑off: you lose per‑tool approval – the code either runs or it doesn’t.
Typical usage
- Code mode – data‑analysis workflows that need to read dozens of files, run queries, and generate reports.
- Agent mode – customer‑support scenarios where we want to review database queries before they execute.
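The agent-mode shape above is a simple loop: the model proposes one tool call, the application executes (or rejects) it, and the result is fed back until the model produces a final answer. A sketch with a stubbed model (real calls would go through Bifrost's chat-completions endpoint; all names here are illustrative):

```go
package main

import "fmt"

// toolCall is one tool invocation proposed by the model.
type toolCall struct {
	Name string
	Args string
}

// stubModel stands in for the LLM: it asks for one tool call on the first
// turn, then answers once it has seen a tool result.
func stubModel(history []string) (*toolCall, string) {
	if len(history) == 0 {
		return &toolCall{Name: "database/query_readonly", Args: "SELECT 1"}, ""
	}
	return nil, "done: " + history[len(history)-1]
}

// runTool stands in for dispatching the call to an MCP server.
func runTool(c toolCall) string {
	return "result-of-" + c.Name
}

// agentLoop drives the call → observe → call cycle until the model stops
// requesting tools. Each iteration is a point where a human (or policy)
// could veto the proposed call.
func agentLoop() string {
	var history []string
	for {
		call, answer := stubModel(history)
		if call == nil {
			return answer
		}
		history = append(history, runTool(*call))
	}
}

func main() {
	fmt.Println(agentLoop())
}
```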
Tool Discovery & Communication
- Tool discovery is faster than expected – retrieving a list of 50+ tools from an MCP server takes under 100 ms.
- STDIO latency is surprisingly low: only 1–2 ms overhead, barely noticeable.
- HTTP adds considerably more latency due to network round‑trips.
Type Safety & Request Filtering
- Type safety prevents many bugs. In‑process tools using Go structs catch issues at compile time that would otherwise become runtime failures with JSON validation.
- Request‑level filtering is more powerful than a global config. Being able to change which tools are available per request gives far more control than static configuration.
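The type-safety point is concrete: decoding tool arguments into a typed Go struct surfaces mismatches at the boundary instead of deep inside a handler. Reusing the `CalculatorArgs` shape from the earlier example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CalculatorArgs mirrors the in-process tool example above.
type CalculatorArgs struct {
	Operation string  `json:"operation"`
	A         float64 `json:"a"`
	B         float64 `json:"b"`
}

// decodeArgs unmarshals raw tool-call arguments into the typed struct;
// a string where a number belongs fails here, not inside the handler.
func decodeArgs(raw []byte) (CalculatorArgs, error) {
	var args CalculatorArgs
	err := json.Unmarshal(raw, &args)
	return args, err
}

func main() {
	// Well-typed payload decodes cleanly.
	ok, err := decodeArgs([]byte(`{"operation":"add","a":2,"b":3}`))
	fmt.Println(ok, err)
	// Wrong type is rejected at the decode boundary.
	_, err = decodeArgs([]byte(`{"operation":"add","a":"two","b":3}`))
	fmt.Println(err)
}
```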
Real‑World Performance Numbers
| Component | Avg. Latency | Notes |
|---|---|---|
| STDIO filesystem tools | 1.2 ms | |
| HTTP database tools | 15 ms | Local network |
| In‑process calculation tools | 0.08 ms | |
| Memory per STDIO connection | ~2 MB | |
| Tool discovery (50 tools) | 85 ms | Go runtime; Python/Node.js MCP servers will be slower, but the architecture scales well. |
MCP Support
- MCP support is live in Bifrost.
- Setup is straightforward: configure your MCP servers, set which tools can auto‑execute, and start making requests.
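As a rough picture of the moving parts, a gateway-side setup declares each server's transport plus the auto-execute list. The field names and shape below are illustrative only, not Bifrost's actual configuration schema; consult the MCP docs for the real format:

```json
{
  "mcp_servers": [
    {
      "name": "filesystem",
      "transport": "stdio",
      "command": ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/data"]
    },
    {
      "name": "database",
      "transport": "http",
      "url": "https://tools.internal.example/mcp"
    }
  ],
  "auto_execute": ["filesystem/read_file", "database/query_readonly"]
}
```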
Documentation
- We documented all four connection types, agent vs. code mode, and request‑level filtering in the MCP docs.
- Examples are provided for common patterns:
- Filesystem
- Database
- Web APIs
Closing Thoughts
If you’re building agents that need to interact with real infrastructure, this approach is cleaner than custom integration code and gives you the observability and security controls required for production.
The code is open source – check out the implementation if you’re curious about the internals. We ship updates based on real‑world usage, so feedback is always welcome.