Show HN: Airbyte Agents – context for agents across multiple data sources

Published: May 5, 2026 at 11:03 AM EDT


Source: Hacker News

Introduction

I’m Michel, co‑founder and CEO of Airbyte (airbyte.com). After six years of building data connectors, we’re launching Airbyte Agents (docs.airbyte.com/ai-agents), a unified data layer that lets agents discover information and take action across operational systems.

Quick walkthrough: YouTube video

The Problem

As agents move into real workflows, they need to interact with many tools (e.g., Slack, Salesforce, Linear). This introduces a lot of API plumbing: authentication, pagination, filters, schema handling, and entity matching across systems.
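To make the plumbing concrete, here is a minimal sketch of just one of those chores, cursor-based pagination. Everything in it is an assumption for illustration: `fetch_page` is a hypothetical in-memory stand-in for a vendor endpoint, not a real Slack, Salesforce, or Linear call.

```python
from typing import Iterator, Optional

# Hypothetical in-memory stand-in for a paginated vendor API.
_TICKETS = [{"id": i, "subject": f"Ticket {i}"} for i in range(1, 6)]


def fetch_page(cursor: Optional[int], page_size: int = 2) -> dict:
    """Return one page of records plus the next cursor (None when exhausted)."""
    start = cursor or 0
    end = start + page_size
    next_cursor = end if end < len(_TICKETS) else None
    return {"records": _TICKETS[start:end], "next_cursor": next_cursor}


def fetch_all(page_size: int = 2) -> Iterator[dict]:
    """Follow cursors until the API reports that no pages remain."""
    cursor = None
    while True:
        page = fetch_page(cursor, page_size)
        yield from page["records"]
        cursor = page["next_cursor"]
        if cursor is None:
            break


all_tickets = list(fetch_all())
```

An agent calling APIs live has to get this loop right for every tool it touches, and each vendor paginates a little differently.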

Most MCP (Model Context Protocol) servers are thin wrappers over APIs, so agents inherit weak primitives and often get it wrong, especially when working across tools.

A deeper issue is that APIs assume you already know what to query (endpoints, object IDs, fields). Agents usually start one step earlier: they must first discover what matters before they can reason.

Airbyte Agents: A Context Layer

Airbyte Agents acts as a Context Store: a data index optimized for agentic search and populated by our replication connectors. This gives agents a structured way to discover data while still allowing direct reads and writes to upstream systems when needed.

Why It Matters

Reduces the number of API calls an agent must make.
Provides a pre‑indexed view of data, improving answer quality and speed.
Handles entity matching and schema translation centrally.
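As a rough illustration of what centralized entity matching can look like, the sketch below links CRM accounts to support organizations by normalized web domain. The field names (`account_id`, `website`, `org_id`, `domain`) and the matching rule are assumptions made for this example, not how Airbyte Agents actually does it.

```python
from typing import Dict, List, Optional


def normalize_domain(value: str) -> str:
    """Lowercase and strip scheme, 'www.' and path so equivalent domains compare equal."""
    v = value.strip().lower()
    for prefix in ("https://", "http://"):
        if v.startswith(prefix):
            v = v[len(prefix):]
    if v.startswith("www."):
        v = v[4:]
    return v.split("/")[0]


def match_entities(accounts: List[dict], orgs: List[dict]) -> Dict[str, Optional[str]]:
    """Map each CRM account ID to the support org with the same domain, or None."""
    by_domain = {normalize_domain(o["domain"]): o["org_id"] for o in orgs}
    return {a["account_id"]: by_domain.get(normalize_domain(a["website"])) for a in accounts}


# Hypothetical records from two systems.
crm_accounts = [
    {"account_id": "A1", "website": "https://www.Acme.com/"},
    {"account_id": "A2", "website": "globex.io"},
]
support_orgs = [
    {"org_id": "Z9", "domain": "acme.com"},
    {"org_id": "Z7", "domain": "initech.com"},
]

matches = match_entities(crm_accounts, support_orgs)
```

Doing this once at indexing time means the agent does not have to rediscover the mapping on every query.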

Example Use Cases

“Show me all enterprise deals closing this month with open support tickets.”
“Find every support ticket that doesn’t have a GitHub issue opened.”

These queries sound simple, but the quality of the answer improves dramatically when the agent doesn’t have to assemble all that context at runtime.
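The second query above is essentially an anti-join. A minimal sketch, assuming the records are already indexed and each GitHub issue carries a resolved `ticket_id` link (both the data and the field names are hypothetical):

```python
from typing import List

# Hypothetical pre-indexed records from two systems.
tickets = [
    {"ticket_id": 101, "subject": "Login fails"},
    {"ticket_id": 102, "subject": "Export is slow"},
    {"ticket_id": 103, "subject": "Wrong invoice total"},
]
issues = [
    {"issue_number": 7, "ticket_id": 101},
    {"issue_number": 9, "ticket_id": 103},
    {"issue_number": 12, "ticket_id": None},  # issue not tied to any ticket
]


def tickets_without_issue(tickets: List[dict], issues: List[dict]) -> List[dict]:
    """Return the tickets that no GitHub issue links back to."""
    linked = {i["ticket_id"] for i in issues if i.get("ticket_id") is not None}
    return [t for t in tickets if t["ticket_id"] not in linked]


unlinked = tickets_without_issue(tickets, issues)
```

With live API calls, the agent would have to page through both systems and infer the links itself before it could run even this simple filter.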

Benchmark

I built a benchmark harness (public on GitHub: airbytehq/airbyte-agents-benchmarks) to compare calling the Airbyte Agent MCP versus calling vendor MCPs directly.

Metric: token consumption (proxy for agent efficiency).

Results

Vendor	Token reduction vs. native MCP
Gong	up to 80% fewer tokens
Zendesk	up to 90% fewer tokens
Linear	up to 75% fewer tokens
Salesforce	up to 16% fewer tokens (Salesforce’s SOQL is already efficient)

The benchmark is intentionally simple, using token usage as a proxy for how efficiently an agent reaches a correct answer.
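For reference, a token-reduction percentage like those in the table can be derived as follows. The token counts in this sketch are made-up placeholders, not figures from the benchmark, and `percent_reduction` is a hypothetical helper rather than part of the harness.

```python
def percent_reduction(baseline_tokens: int, candidate_tokens: int) -> float:
    """Percentage of tokens saved by the candidate relative to the baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 100 * (baseline_tokens - candidate_tokens) / baseline_tokens


# Placeholder counts: a native-MCP run versus an Airbyte Agent MCP run.
native_tokens = 10_000
airbyte_tokens = 2_000
saved = percent_reduction(native_tokens, airbyte_tokens)
```

A negative result would mean the candidate used more tokens than the baseline.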

Call for Feedback

We’re early in development and some parts are rough, but we’d love input from the community:

Are you indexing data ahead of time, or letting the agent call APIs live?
How are you matching entities across systems?

Feel free to poke at the benchmark harness and share any thoughts, comments, or ideas on how we can improve Airbyte Agents.

We’re excited to keep building!
