Show HN: Airbyte Agents – context for agents across multiple data sources
Source: Hacker News
Introduction
I’m Michel, co‑founder and CEO of Airbyte (airbyte.com). After six years of building data connectors, we’re launching Airbyte Agents (docs.airbyte.com/ai-agents), a unified data layer that lets agents discover information and take action across operational systems.
Quick walkthrough: YouTube video
The Problem
As agents move into real workflows, they need to interact with many tools (e.g., Slack, Salesforce, Linear). This introduces a lot of API plumbing: authentication, pagination, filters, schema handling, and entity matching across systems.
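To make one item on that list concrete, here is a minimal sketch of cursor pagination against a made-up in-memory API. `fetch_page`, its cursor format, and the ticket records are hypothetical stand-ins for illustration, not any vendor's actual API.

```python
from typing import Iterator, Optional

# Hypothetical in-memory "API": 7 tickets served 3 per page.
_TICKETS = [{"id": i, "subject": f"ticket {i}"} for i in range(1, 8)]
PAGE_SIZE = 3


def fetch_page(cursor: Optional[int] = None) -> dict:
    """Stand-in for one HTTP call; returns a page plus the next cursor."""
    start = cursor or 0
    items = _TICKETS[start:start + PAGE_SIZE]
    end = start + len(items)
    # A real API would signal the last page in its own way; None is assumed here.
    next_cursor = end if end < len(_TICKETS) else None
    return {"items": items, "next_cursor": next_cursor}


def iter_tickets() -> Iterator[dict]:
    """Follow cursors until the API reports no further pages."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page["next_cursor"]
        if cursor is None:
            break


all_tickets = list(iter_tickets())
```

An agent talking to raw APIs has to get a loop like this right for every tool, on top of auth, filters, and schema differences.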
Most MCP (Model Context Protocol) servers are thin wrappers over APIs, so agents inherit weak primitives and often get things wrong, especially when working across tools.
A deeper issue is that APIs assume you already know what to query (endpoints, object IDs, fields). Agents usually start one step earlier: they must first discover what matters before they can reason.
Airbyte Agents: A Context Layer
Airbyte Agents provides a Context Store: a data index optimized for agentic search and populated by our replication connectors. Agents get a structured way to discover data and can still read from and write to upstream systems directly when needed.
Why It Matters
- Reduces the number of API calls an agent must make.
- Provides a pre‑indexed view of data, improving answer quality and speed.
- Handles entity matching and schema translation centrally.
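As a rough illustration of what central entity matching can look like, the sketch below links support users to CRM accounts by normalized domain. The matching heuristic, field names, and records are all assumptions made for the example; the post does not describe how Airbyte Agents actually matches entities.

```python
import re


def normalize_domain(value: str) -> str:
    """Reduce an email address or URL to a bare lowercase domain."""
    value = value.strip().lower()
    value = value.split("@")[-1]              # drop the local part of an email
    value = re.sub(r"^https?://", "", value)  # drop a URL scheme
    value = re.sub(r"^www\.", "", value)      # drop a leading "www."
    return value.split("/")[0]                # drop any path


# Hypothetical records from two systems.
crm_accounts = [
    {"account_id": "A1", "website": "https://www.Acme.com/about"},
    {"account_id": "A2", "website": "globex.io"},
]
support_users = [
    {"user_id": "U1", "email": "Jane@acme.com"},
    {"user_id": "U2", "email": "sam@initech.dev"},
]

# Build the index once, then match each user by its normalized key.
accounts_by_domain = {
    normalize_domain(a["website"]): a["account_id"] for a in crm_accounts
}
matches = {
    u["user_id"]: accounts_by_domain.get(normalize_domain(u["email"]))
    for u in support_users
}
```

Doing this once at index time means the agent does not have to rediscover the join key on every query.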
Example Use Cases
- “Show me all enterprise deals closing this month with open support tickets.”
- “Find every support ticket that doesn’t have a GitHub issue opened.”
These queries sound simple, but the quality of the answer improves dramatically when the agent doesn’t have to assemble all that context at runtime.
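For instance, once tickets and issues sit in one index, the second question reduces to an anti-join. The sketch below assumes a hypothetical `linked_ticket_id` field already connects the two record types; the data is invented for illustration.

```python
# Hypothetical pre-indexed records from a support tool and GitHub.
tickets = [
    {"ticket_id": "T-1", "subject": "Login fails"},
    {"ticket_id": "T-2", "subject": "Export is slow"},
    {"ticket_id": "T-3", "subject": "Typo in invoice"},
]
issues = [
    {"issue_number": 101, "linked_ticket_id": "T-1"},
    {"issue_number": 102, "linked_ticket_id": "T-3"},
    {"issue_number": 103, "linked_ticket_id": None},  # issue with no ticket
]


def tickets_without_issue(tickets: list, issues: list) -> list:
    """Anti-join: keep tickets whose ID is not referenced by any issue."""
    linked = {i["linked_ticket_id"] for i in issues if i["linked_ticket_id"]}
    return [t for t in tickets if t["ticket_id"] not in linked]


unlinked = tickets_without_issue(tickets, issues)
```

Without the index, the agent would have to page through both systems and work out the link itself before it could even start answering.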
Benchmark
I built a benchmark harness (public on GitHub: airbytehq/airbyte-agents-benchmarks) to compare calling the Airbyte Agents MCP against calling vendor MCPs directly.
Metric: token consumption (proxy for agent efficiency).
Results
| Vendor | Token reduction vs. native MCP |
|---|---|
| Gong | up to 80% fewer tokens |
| Zendesk | up to 90% fewer tokens |
| Linear | up to 75% fewer tokens |
| Salesforce | up to 16% fewer tokens (Salesforce’s SOQL is already efficient) |
The benchmark is intentionally simple, using token usage as a proxy for how efficiently an agent reaches a correct answer.
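One plausible way to read an "X% fewer tokens" figure is as a relative reduction between two runs of the same task. The sketch below assumes that definition; the function name and the token counts are illustrative only and are not taken from the benchmark harness or its results.

```python
def token_reduction(native_tokens: int, indexed_tokens: int) -> float:
    """Fraction of tokens saved relative to the native-MCP run."""
    if native_tokens <= 0:
        raise ValueError("native_tokens must be positive")
    return 1 - indexed_tokens / native_tokens


# Made-up counts for a single task, not benchmark data.
reduction = token_reduction(native_tokens=1000, indexed_tokens=400)
print(f"{reduction:.0%} fewer tokens")  # prints "60% fewer tokens"
```

Note that "up to" figures like those in the table would correspond to the best case across tasks, not an average.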
Call for Feedback
We’re early in development and some parts are rough, but we’d love input from the community:
- Are you indexing data ahead of time, or letting the agent call APIs live?
- How are you matching entities across systems?
Feel free to poke at the benchmark harness and share any thoughts, comments, or ideas on how we can improve Airbyte Agents.
We’re excited to keep building!