Building Production-Grade AI Agents with MCP & A2A: A Guide from the Trenches

Published: December 24, 2025 at 11:40 PM EST
8 min read
Source: Dev.to

TL;DR In this article I share my journey from fragile, custom‑built AI‑agent architectures to a robust, standardized approach using the Model Context Protocol (MCP). I’ll walk you through:

  • Why Agent‑to‑Agent (A2A) communication is the missing link in production systems.
  • How I designed a Daily Minutes Assistant using a standardized contract.
  • The exact code infrastructure I built (and how you can too).
  • Why I believe standardizing context is more important than standardizing prompts.

If you’re tired of debugging why your agent hallucinated a function call, this read is for you.

From Chaos to Contract: How I Tamed the Agentic Wild West




Introduction

I still remember the late nights spent debugging my first complex multi‑agent system. I had a Research Agent that was supposed to talk to a Writer Agent. It worked beautifully in my Jupyter notebook, but the moment I deployed it… chaos.

  • The Research Agent output JSON; the Writer Agent expected Markdown.
  • The “Memory” module was a global dictionary that kept getting overwritten.

It was a house of cards.

From my experience, this is where most AI engineering stalls today. We build impressive demos, but production reliability eludes us because we lack a fundamental protocol for communication.

Then I discovered generic protocols like MCP (Model Context Protocol). I realized the problem wasn’t my prompt engineering—it was my architecture. I didn’t need smarter models; I needed better contracts. In my opinion, adopting a strict protocol is the difference between a toy and a tool.


What This Article Is About

This isn’t a high‑level fluff piece about “The Future of AI.” It’s a muddy‑boots, code‑heavy walkthrough of how I built a production‑grade Agent‑to‑Agent (A2A) system.

I’ll show you how I built a Daily Minutes Assistant—a system that:

  1. Connects to my calendar.
  2. Pulls meeting transcripts.
  3. Summarizes them.
  4. Emails me the action items.

The “magic” isn’t the summary; it’s the plumbing.
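
To make the plumbing concrete, the four steps above can be sketched as a plain pipeline. Every function and piece of data here is an illustrative mock, not the real calendar, transcript, or email integration:

```python
def fetch_calendar_events(day: str) -> list[dict]:
    # Mock: the real assistant reads these from a calendar resource.
    return [{"title": "Standup", "transcript_id": "t-1"}]

def fetch_transcript(transcript_id: str) -> str:
    # Mock: the real assistant pulls this from a meeting-recorder resource.
    return "Alice: ship v2 Friday. Bob: I'll update the docs."

def summarize(transcript: str) -> str:
    # Mock: production would route this through an LLM via the MCP client.
    return "Action items: ship v2 Friday; update docs."

def email_minutes(summary: str) -> str:
    # Mock: production would call a send_email tool.
    return f"Emailed: {summary}"

def daily_minutes(day: str) -> list[str]:
    """Chain the four steps: calendar -> transcript -> summary -> email."""
    results = []
    for event in fetch_calendar_events(day):
        transcript = fetch_transcript(event["transcript_id"])
        results.append(email_minutes(summarize(transcript)))
    return results

print(daily_minutes("2025-12-24"))
```

The point of the sketch is the shape of the data flow; each mock becomes a tool or resource call once the MCP plumbing below is in place.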


Tech Stack

I chose this stack because I wanted reliability over hype:

  • Python 3.12+ – strong typing, async support
  • MCP SDK (mcp‑python) – core backbone for standardized tool & resource definition
  • FastAPI / FastMCP – rapidly stand up the server interface
  • Pydantic – rock‑solid data validation

Why Read It?

If you’ve ever felt the pain of:

  • Writing custom API wrappers for every new tool.
  • Agents getting stuck in loops because they don’t know when to stop.
  • Trying to connect a local specialized agent to a cloud‑based LLM.

…then this experimental PoC I built will speak directly to your soul. I wrote this because I wish someone had shown me this pattern six months ago.


Let’s Design

Before I wrote a single line of code, I stepped back to design the interaction. I thought: “If these agents were employees, how would they pass documents?”

In my view, an agent needs three things:

  1. Tools – Things it can do (search the web, send email).
  2. Resources – Things it can read (calendar, logs).
  3. Prompts – Standardized ways to ask for things.

I sketched out the following flow (simplified):

Client  →  FastMCP Server (exposes tools/resources)  →  LLM

I designed it this way because I wanted the SummaryServer to be completely ignorant of the ResearchServer. Decoupling them lets me swap out the research engine later without breaking the calendar integration.


Let’s Get Cooking

Below is the actual implementation. I structured the project to separate the Server (which exposes capabilities) from the Client (which consumes them).

Step 1 – The FastMCP Server

import os
from mcp.server.fastmcp import FastMCP

# Initialize the FastMCP server.
# In my opinion, naming your server clearly is crucial for multi‑agent debugging.
mcp = FastMCP("DailyAssistant")

@mcp.tool()
async def search_web(query: str, limit: int = 5) -> str:
    """
    Search the web for a given query.

    Args:
        query: The search query.
        limit: Max results to return.
    """
    # In a real deployment this would call Tavily, Serper, etc.
    # For this PoC we mock the return to focus on the protocol.
    return (
        f"Mock search results for '{query}':\n"
        "1. Result A\n"
        "2. Result B"
    )

@mcp.resource("config://app_settings")
def get_app_settings() -> str:
    """Get application configuration settings."""
    return "Theme: Dark\nNotifications: Enabled"

What This Does

  • Defines a server that offers a search_web tool and a config://app_settings resource.

Why I Structured It This Way

  • Decorators (@mcp.tool()) keep the definition close to the implementation.
  • Defining tools in a separate JSON file often leads to drift where the implementation changes but the schema does not.

What I Learned

  • The type hints (query: str) aren’t just for show. MCP uses them to generate the JSON schema that the LLM eventually sees. If you’re lazy with types here, your agent will be confused later.
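
To see why the hints matter, here is a rough, hand‑rolled sketch of how a JSON schema can be derived from a Python signature. The real SDK's schema generation is richer (it uses docstrings and Pydantic); this simplified version only illustrates the principle:

```python
import inspect
from typing import get_type_hints

def search_web(query: str, limit: int = 5) -> str: ...

# Simplified mapping from Python types to JSON-schema type names.
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

hints = get_type_hints(search_web)
schema = {
    "type": "object",
    "properties": {
        name: {"type": TYPE_MAP[tp]}
        for name, tp in hints.items()
        if name != "return"
    },
    # Parameters without defaults become required fields.
    "required": [
        name
        for name, p in inspect.signature(search_web).parameters.items()
        if p.default is inspect.Parameter.empty
    ],
}
print(schema)
```

Drop the `: str` hint and `query` loses its type in the schema, which is exactly the kind of gap that confuses the model at call time.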

Step 2 – The Agent Client

import asyncio
import sys
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_client():
    # I decided to use Stdio for local communication.
    # It’s faster and secure by default for side‑car patterns.
    server_params = StdioServerParameters(
        command=sys.executable,
        args=["src/server/agent_server.py"],
        env=None,
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # The MCP handshake must complete before any tool calls.
            await session.initialize()

            # Example: ask the server to perform a web search.
            result = await session.call_tool(
                "search_web",
                {"query": "MCP protocol overview", "limit": 3},
            )
            print("Search result:", result)

if __name__ == "__main__":
    asyncio.run(run_client())

What This Does

  • Spins up a local stdio subprocess that runs agent_server.py.
  • Uses ClientSession to invoke the search_web tool without caring how it’s implemented.

Why It Matters

  • The client only needs the contract (tool name + JSON schema).
  • Swapping the server implementation (e.g., moving to a remote FastAPI endpoint) requires only a change to StdioServerParameters.
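
The "contract is all you need" idea can be shown without MCP at all. In this pure‑Python sketch (all names hypothetical), the client holds only a tool name and an argument shape; swapping the implementation behind that name touches nothing on the client side:

```python
from typing import Callable

# The "contract": a tool name mapped to a callable with a fixed signature.
ToolImpl = Callable[[str, int], str]

def mock_search(query: str, limit: int = 5) -> str:
    # Stand-in for the stdio-backed server.
    return f"mock:{query}"

def cached_search(query: str, limit: int = 5) -> str:
    # Stand-in for a remote HTTP-backed server with caching.
    return f"cache:{query}"

class Client:
    """Knows only tool names and argument shapes, never implementations."""

    def __init__(self, registry: dict[str, ToolImpl]):
        self.registry = registry

    def call_tool(self, name: str, query: str, limit: int = 5) -> str:
        return self.registry[name](query, limit)

# Swapping the server implementation is a one-line registry change.
client = Client({"search_web": mock_search})
print(client.call_tool("search_web", "MCP"))

client = Client({"search_web": cached_search})
print(client.call_tool("search_web", "MCP"))
```

MCP plays the role of the `registry` here, except the contract is a JSON schema negotiated at runtime instead of a Python type.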

Step 3 – Wiring It All Together (FastAPI Wrapper)

If you prefer an HTTP‑based server instead of stdio, the same decorators work; FastMCP can hand you an ASGI app that you serve with uvicorn or mount inside a FastAPI application:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("DailyAssistant")

@mcp.tool()
async def send_email(to: str, subject: str, body: str) -> str:
    """
    Mock email sender.
    """
    # In production you’d integrate with an SMTP service or SendGrid.
    return f"Email sent to {to} with subject '{subject}'."

# streamable_http_app() returns an ASGI app; check your SDK version for
# the exact transport helpers available.
app = mcp.streamable_http_app()

Running uvicorn this_module:app --reload now exposes the send_email tool via MCP’s streamable HTTP transport.


Lessons Learned

  • Contracts trump prompts – A well‑defined schema (tool name, args, return type) prevents the LLM from “hallucinating” calls.
  • Decouple producers & consumers – Keep the server ignorant of who is calling it. This enables independent versioning.
  • Type hints are your friends – MCP auto‑generates JSON schemas from Python type hints. Missing or wrong hints = broken agents.
  • Standardize context, not just output – Sharing a common context (e.g., calendar IDs, user preferences) across agents reduces duplication and errors.
  • Prefer local IPC for side‑cars – stdio is fast, secure, and avoids network latency when both processes run on the same host.

TL;DR (Revisited)

  • Use MCP to define tools and resources with Python decorators.
  • Keep agents thin: they only need to know what they can call, not how it’s implemented.
  • Decouple services → swap implementations without breaking contracts.

Final Thoughts

Standardizing context via a protocol like MCP turned my flaky demo into a production‑ready system. The effort spent on a clean contract paid off many times over: fewer bugs, easier debugging, and the ability to scale agents independently.

If you’re building anything beyond a single‑agent proof‑of‑concept, give MCP a try. Your future self (and your ops team) will thank you.

Session Example

async with ClientSession(read, write) as session:
    await session.initialize()

    # Dynamic Discovery
    tools = await session.list_tools()
    print(f"Connected! Found tools: {[t.name for t in tools.tools]}")

    # Execution
    result = await session.call_tool(
        "search_web",
        arguments={"query": "MCP adoption"}
    )
    print(f"Tool Output: {result.content[0].text}")

What This Does

It launches the server as a subprocess and connects via standard input/output. It then dynamically asks “What can you do?” (list_tools) before asking it to do something.

My Experience Here

I initially tried to use HTTP for everything, but for local agents—like a coding assistant running on my laptop—stdio is vastly superior. It has zero network overhead and simplifies the auth story (if you can run the process, you have access).
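
To make the stdio point concrete, here is a toy JSON‑over‑stdio round trip using only the standard library. It loosely mimics the request/response framing idea; it is not the actual MCP wire format:

```python
import json
import subprocess
import sys

# A minimal child process that answers one JSON request over stdio.
CHILD = r"""
import json, sys
req = json.loads(sys.stdin.readline())
resp = {"id": req["id"], "result": f"echo:{req['params']['query']}"}
sys.stdout.write(json.dumps(resp) + "\n")
sys.stdout.flush()
"""

# Spawn the "server" as a subprocess and talk to it over pipes.
proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

request = {"id": 1, "method": "tools/call", "params": {"query": "MCP"}}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()

response = json.loads(proc.stdout.readline())
proc.wait()
print(response)
```

No sockets, no ports, no TLS, and the OS process boundary is the auth boundary: if you can spawn the child, you can talk to it.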


Let’s Set Up

If you want to run this PoC yourself, I’ve kept it dead simple.

Prerequisites

  • Python 3.10+
  • A virtual environment (always use venvs!)

Clone the Repository

(Link to the public repo is provided below.)

Install Dependencies

pip install mcp httpx

Verify Installation

python -c "import mcp; print(mcp.__version__)"

Let’s Run

Running the system is straightforward.

python src/client/agent_client.py

You should see output indicating the handshake succeeded, followed by the mock search result.

What to Watch For

If you see a generic ConnectionRefused or a pipe error, it’s usually because the server script crashed on startup (e.g., missing imports) before the handshake could complete. Always verify your server runs standalone first!


Closing Thoughts

Building this experimental “Daily Minutes Assistant” taught me that the future of AI isn’t just about bigger context windows—it’s about structured context.

In my view, we are moving from “Prompt Engineering” to Context Engineering. The MCP approach lets us treat tools and resources as first‑class citizens, bridging the gap between flashy demos and reliable production software.

I hope this guide saves you some of the headaches I faced. The code is available, so fork it, break it, and let me know what you build.

Tags: ai, python, mcp, agents


Disclaimer

The views and opinions expressed here are solely my own and do not represent the views, positions, or opinions of my employer or any organization I am affiliated with. The content is based on my personal experience and experimentation and may be incomplete or incorrect. Any errors or misinterpretations are unintentional, and I apologize in advance if any statements are misunderstood or misrepresented.
