Introducing Virtual MCP Server: Unified Gateway for Multi-MCP Workflows
Source: Dev.to
The problem: connection overload
Picture this: you’re an engineer on a platform team. Your AI assistant needs access to GitHub for code, Jira for tickets, Slack for notifications, PagerDuty for incidents, Datadog for metrics, AWS for infrastructure, Confluence for docs, and your internal knowledge base. That’s 8 separate MCP server connections, each exposing 10‑20+ tools. Now your AI’s context window is filling up with 80+ tool descriptions, burning tokens and degrading performance as the LLM struggles to select the right tools from an overwhelming list.
Each MCP server connection requires:
- Individual configuration in your AI client
- Separate authentication credentials
- Manual coordination when tasks span multiple systems
- Repeated parameter entry (same repo, same channel, same database)
- Tool filtering to avoid context bloat and wasted tokens
Want to investigate a production incident? You’re manually running commands across 4 different systems and piecing together the results yourself. Deploying an app? You’re orchestrating a sequence of operations: merge PR, wait for CI, get approval, deploy, notify team. It’s tedious, error‑prone, and not reusable.
The solution: aggregate everything
vMCP transforms those 8 connections into one. You configure a single MCP endpoint that aggregates all your backend servers.
Before vMCP:
```json
{
  "servers": {
    "github": { "url": "..." },
    "jira": { "url": "..." },
    "slack": { "url": "..." },
    "pagerduty": { "url": "..." },
    "datadog": { "url": "..." },
    "aws": { "url": "..." },
    "confluence": { "url": "..." },
    "docs": { "url": "..." }
  }
}
```
With vMCP:
```json
{
  "servers": {
    "company-tools": {
      "url": "http://vmcp.company.com/mcp"
    }
  }
}
```
One connection. One authentication flow. All your tools available.
You can run as many vMCP instances as you need. Your frontend team connects to one vMCP with their specific tools; your platform team connects to another with infrastructure access. Each vMCP aggregates exactly the backends that each team needs, with appropriate security policies and permissions. This improves security (no more giving everyone access to everything) and efficiency (fewer tools means smaller context windows, lower token costs, and better AI performance).
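To make the per-team idea concrete, here is a sketch of what two team-scoped vMCP instances could look like. The field names (`backends`, `allowedTools`) and tool names are illustrative assumptions, not the actual ToolHive configuration schema:

```json
{
  "frontend-team-vmcp": {
    "backends": ["github", "slack", "docs"],
    "allowedTools": ["search_code", "post_message", "search_docs"]
  },
  "platform-team-vmcp": {
    "backends": ["aws", "datadog", "pagerduty"],
    "allowedTools": ["describe_instances", "query_metrics", "list_incidents"]
  }
}
```

Each instance exposes only its own filtered tool set, so a frontend engineer's AI client never sees infrastructure tools at all.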
What vMCP does
vMCP is part of the ToolHive Kubernetes Operator. It acts as an intelligent aggregation layer that sits between your AI client and your backend MCP servers.
1. Multi‑server aggregation with tool filtering
All MCP tools appear through a single endpoint, but you cherry‑pick exactly which tools to expose.
- Example: an engineer on the ToolHive team gets a single vMCP connection with:
  - GitHub's `search_code` tool (scoped to the `stacklok/toolhive` repo only)
  - The ToolHive docs MCP server
  - An internal docs server hooked up to Google Drive and filtered to ToolHive design docs
  - Slack (only the `#toolhive-team` channel)
No irrelevant tools clutter the LLM’s context, and no tokens are wasted on unused tool descriptions.
When multiple MCP servers have tools with the same name (e.g., both GitHub and Jira have `create_issue`), vMCP automatically prefixes them (`github_create_issue`, `jira_create_issue`). You can customize these names as needed.
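The prefixing logic is easy to picture. This is a minimal sketch of the idea, not vMCP's actual implementation: only names that collide across backends get a server prefix, while unique names pass through unchanged.

```python
def aggregate_tools(servers):
    """Merge tool lists from multiple backends, prefixing names that collide."""
    # Count how many backends expose each tool name
    counts = {}
    for tools in servers.values():
        for name in tools:
            counts[name] = counts.get(name, 0) + 1
    merged = {}
    for server, tools in servers.items():
        for name in tools:
            # Only colliding names get a server prefix
            exposed = f"{server}_{name}" if counts[name] > 1 else name
            merged[exposed] = (server, name)
    return merged

tools = aggregate_tools({
    "github": ["create_issue", "search_code"],
    "jira": ["create_issue"],
})
# "create_issue" collides, so both copies get prefixed;
# "search_code" is unique and keeps its name
```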
2. Declarative multi‑system workflows
Real tasks often require coordinating across multiple systems. vMCP lets you define deterministic workflows that run independent steps in parallel, with conditionals, error handling, and approval gates.
Incident investigation workflow
→ Query logs from logging system
→ Fetch metrics from monitoring platform
→ Pull traces from tracing service
→ Check infrastructure status from cloud provider
→ Manually combine everything into a report
→ Create Jira ticket with findings
vMCP runs the queries in parallel, aggregates the data, and creates the ticket. Define the workflow once and reuse it for every incident.
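The fan-out-then-aggregate pattern described above can be sketched in a few lines of async Python. The four query functions are hypothetical stand-ins for the backend MCP tool calls; the point is that the reads run concurrently and their results merge into one report:

```python
import asyncio

# Stand-in backend calls; in vMCP these would be MCP tool invocations.
async def query_logs():    return {"logs": ["error at 12:01"]}
async def query_metrics(): return {"metrics": {"cpu": 0.93}}
async def query_traces():  return {"traces": ["span-abc"]}
async def query_infra():   return {"infra": "degraded"}

async def investigate_incident():
    # Fan out the four read-only queries in parallel
    results = await asyncio.gather(
        query_logs(), query_metrics(), query_traces(), query_infra()
    )
    # Aggregate the partial results into one report, then file the ticket
    report = {k: v for part in results for k, v in part.items()}
    return {"summary": "Incident report", "body": report}

ticket = asyncio.run(investigate_incident())
```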
App deployment workflow
→ Merge pull request in GitHub
→ Wait for CI tests to pass
→ Request human approval (using MCP elicitation)
→ Deploy (only if approved)
→ Notify team in Slack
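Unlike the incident queries, these steps are sequential and gated. A rough sketch of the control flow, with the step implementations injected as plain callables (all names here are hypothetical, and the approval gate stands in for MCP elicitation, where the server asks the human client for a yes/no answer):

```python
def deploy_app(merge_pr, ci_passed, request_approval, deploy, notify):
    """Sequential deployment pipeline with CI and human-approval gates."""
    merge_pr()
    if not ci_passed():
        notify("CI failed; deployment aborted")
        return "aborted"
    # Approval gate: in vMCP this maps to MCP elicitation
    if not request_approval("Deploy to production?"):
        notify("Deployment rejected by approver")
        return "rejected"
    deploy()
    notify("Deployment complete")
    return "deployed"

messages = []
result = deploy_app(
    merge_pr=lambda: None,
    ci_passed=lambda: True,
    request_approval=lambda prompt: True,  # approver says yes
    deploy=lambda: None,
    notify=messages.append,
)
```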
3. Pre‑configured defaults and guardrails
Stop typing the same parameters repeatedly. Configure defaults once in vMCP.
Before: Every GitHub query requires specifying `repo: stacklok/toolhive`
After: The repo is pre‑configured, preventing accidental queries to the wrong repository.
Pre‑configuring parameters ensures deterministic behavior, security, and consistency across all users.
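The distinction between a default (which fills a gap) and a pinned guardrail (which cannot be overridden) is worth making explicit. A minimal sketch, assuming hypothetical parameter names:

```python
# Hypothetical guardrail: parameters pinned by the vMCP operator always win,
# while defaults only fill in values the caller omitted.
PINNED = {"repo": "stacklok/toolhive"}
DEFAULTS = {"per_page": 30}

def apply_defaults(args):
    merged = {**DEFAULTS, **args}  # caller values override defaults...
    merged.update(PINNED)          # ...but never the pinned parameters
    return merged

call = apply_defaults({"query": "vmcp", "repo": "someone/else"})
# The caller's attempt to redirect the query to "someone/else" is overridden.
```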
4. Tool customization and security policies
Third‑party MCP servers often expose generic, unrestricted tools. vMCP lets you wrap and restrict them without modifying upstream servers.
- Security policy enforcement – Restrict a website‑fetch tool to internal domains only (`*.company.com`), validate URLs before calling the backend, and provide clear error messages for violations.
- Simplified interfaces – Wrap a complex AWS EC2 tool that has 20+ parameters, exposing only the three parameters your frontend team needs, with safe defaults for everything else.
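A domain allowlist check like the one described for the website-fetch tool can be sketched as follows (the `company.com` domain is the article's illustrative example; the function name is hypothetical):

```python
from urllib.parse import urlparse

ALLOWED_DOMAIN = "company.com"  # illustrative internal domain

def check_fetch_url(url):
    """Allow company.com and its subdomains; reject everything else."""
    host = (urlparse(url).hostname or "").lower()
    if host == ALLOWED_DOMAIN or host.endswith("." + ALLOWED_DOMAIN):
        return True
    raise ValueError(f"Blocked: {host!r} is outside *.{ALLOWED_DOMAIN}")
```

Note that matching on `"." + ALLOWED_DOMAIN` rather than a bare substring is what keeps look-alike hosts such as `evilcompany.com` out.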
5. Centralized authentication
vMCP implements a two‑boundary authentication model with a complete audit trail. Your AI client authenticates once to vMCP using the OAuth 2.1 methods defined in the official MCP spec. vMCP then handles authorization to each backend independently based on its requirements.
When you need to revoke access, disable the user in your identity provider and all backend access is revoked instantly.
Real‑world benefits
Without vMCP
- 4 sequential manual commands
- 2‑3 minutes per command
- 5‑10 minutes aggregating and formatting
- 15‑20 minutes total per incident
- Results vary by engineer; process isn’t documented or reusable
With vMCP
- One command triggers the workflow
- Parallel execution: ~30 seconds
- Automatic aggregation and formatting
- Consistent results every time
- Workflow is documented as code; any team member can use it
For a team handling 20 incidents per week, saving roughly 15 minutes per incident works out to about 5 hours saved weekly, along with reduced token costs and more reliable incident response.
