Bringing Async MCP to Google Cloud Run — Introducing cloudrun-mcp

Published: February 22, 2026 at 09:19 PM EST
3 min read
Source: Dev.to

Introduction

When you design distributed AI or agentic workloads on Google Cloud’s Cloud Run, you often juggle three recurring problems:

  • How to authenticate workloads securely
  • How to maintain long-lived, event‑driven sessions
  • How to stream model context data efficiently without blocking threads

cloudrun-mcp solves all three in one lightweight Python SDK.

What is MCP (Model Context Protocol)?

MCP (Model Context Protocol) is an emerging open standard for exchanging context between AI models, tools, and environments.

Think of it as “WebSockets for AI knowledge.”
Instead of hardcoding API calls, your model connects to an MCP server and streams structured events such as:

  • context.create
  • document.attach
  • agent.reply

For developers deploying AI agents on Cloud Run, GKE, or hybrid workloads, an async client is essential for scalability.

Introducing cloudrun-mcp

Async MCP (Model Context Protocol) client for Cloud Run.

Built by Raghava Chellu (February 2026), cloudrun-mcp brings:

  • First‑class async streaming
  • Automatic Cloud Run authentication
  • Agentic‑AI‑friendly APIs

to your production workloads.

How It Works

Under the hood:

  • The client uses aiohttp to maintain an HTTP/1.1 keep‑alive streaming session.
  • Inside Cloud Run, it queries the metadata service to obtain a signed JWT:
GET http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity?audience=<SERVICE_URL>
  • Each event from the MCP server arrives as a Server‑Sent Event (SSE).
  • The SDK yields events as a Python async iterator, ready for real‑time AI reasoning loops.
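The SSE-to-async-iterator step above can be sketched with the standard library alone; the real SDK reads lines from an aiohttp streaming response, but the parsing logic is the same. Everything here (`parse_sse_line`, `events`, the `fake_stream` stand-in) is an illustrative reconstruction, not the SDK's actual code:

```python
import asyncio
import json

def parse_sse_line(raw: bytes):
    """Decode one SSE line; return the JSON payload of a `data:` line, else None."""
    line = raw.decode().strip()
    if line.startswith("data:"):
        return json.loads(line[len("data:"):])
    return None  # comments, blank keep-alive lines, etc.

async def events(byte_lines):
    """Turn an async iterable of raw SSE lines into an async iterator of events."""
    async for raw in byte_lines:
        event = parse_sse_line(raw)
        if event is not None:
            yield event

async def fake_stream():
    # Stand-in for `response.content` on an aiohttp streaming request.
    for raw in (b'data: {"event":"context.create","status":"ok"}',
                b"",  # blank SSE separator line
                b'data: {"event":"model.done"}'):
        yield raw

async def main():
    return [event async for event in events(fake_stream())]

print(asyncio.run(main()))
# prints [{'event': 'context.create', 'status': 'ok'}, {'event': 'model.done'}]
```

The real client yields an iterator of exactly this shape, which is what makes `async for event in client.events():` possible.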

Installation

pip install cloudrun-mcp

Requirements

  • Python ≥ 3.10
  • Deployed on GCP (Cloud Run / GKE / GCE) with metadata‑server access

Usage Example

import asyncio
from cloudrun_mcp import MCPClient

async def main():
    # Point the client at your deployed MCP server; auth is handled automatically.
    client = MCPClient(base_url="https://your-mcp-server.run.app")
    # Consume the server's event stream as an async iterator.
    async for event in client.events():
        print(event)

asyncio.run(main())

Typical Output Stream

{"event":"context.create","status":"ok"}
{"event":"model.response","content":"42"}
{"event":"model.done"}

That’s it — you’ve connected an async agent running on Cloud Run to an MCP backend and are receiving real‑time context updates.
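A reasoning loop usually branches on the `event` field of each line. Below is a minimal dispatch sketch over the sample stream above; the handler strings are invented for illustration and are not part of cloudrun-mcp:

```python
import json

def dispatch(raw: str) -> str:
    """Route one raw JSON event line to illustrative handler logic."""
    event = json.loads(raw)
    kind = event["event"]
    if kind == "context.create":
        return f"context ready ({event['status']})"
    if kind == "model.response":
        return f"model says: {event['content']}"
    if kind == "model.done":
        return "stream finished"
    return f"unhandled event: {kind}"

for line in (
    '{"event":"context.create","status":"ok"}',
    '{"event":"model.response","content":"42"}',
    '{"event":"model.done"}',
):
    print(dispatch(line))
# prints:
# context ready (ok)
# model says: 42
# stream finished
```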

Why Async MCP Matters

AI workloads are evolving from simple request‑response APIs to long‑running reasoning graphs.
Synchronous I/O becomes a bottleneck.

cloudrun-mcp leverages Python’s asyncio to keep event loops responsive across:

  • Streaming token generation
  • Function‑calling orchestration
  • Multi‑model chains

It’s especially powerful for Agentic AI, where orchestrators consume continuous model context (tool outputs, planning updates, memory events).

Authentication Deep Dive

The SDK automatically:

  1. Discovers the metadata endpoint.
  2. Retrieves an ID token targeting your MCP server.
  3. Injects it into request headers:
Authorization: Bearer <token>
  4. Refreshes tokens every ~55 minutes.

No OAuth flows. No key.json files.

Perfect for production micro‑agents.
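The token fetch in steps 1–2 can be sketched with nothing but the standard library. This is an illustration of the metadata-server contract, not the SDK's own code, and the audience value below is a placeholder for your service's URL:

```python
import urllib.request

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)

def identity_request(audience: str) -> urllib.request.Request:
    """Build the metadata-server request for an ID token scoped to `audience`."""
    return urllib.request.Request(
        f"{METADATA_URL}?audience={audience}",
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )

def fetch_id_token(audience: str) -> str:
    """Resolve the token; only works inside GCP where the metadata server exists."""
    with urllib.request.urlopen(identity_request(audience)) as resp:
        return resp.read().decode()

# Illustrative audience: your MCP server's Cloud Run URL.
print(identity_request("https://your-mcp-server.run.app").full_url)
```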

Streaming with Back‑Pressure Control

async for event in client.events(buffer=32):
    await handle_event(event)
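The `buffer` argument suggests a bounded internal queue: when your handler falls behind, the reader stops pulling from the socket rather than piling events up in memory. A generic asyncio.Queue sketch of that behavior (not the SDK's internals):

```python
import asyncio

async def producer(queue: asyncio.Queue, n: int):
    """Emit n events; `put` blocks once the queue is full -- that is the back-pressure."""
    for i in range(n):
        await queue.put(i)
    await queue.put(None)  # sentinel: end of stream

async def consumer(queue: asyncio.Queue, handled: list):
    """Stand-in for `await handle_event(event)` on each item."""
    while (event := await queue.get()) is not None:
        handled.append(event)

async def main():
    queue = asyncio.Queue(maxsize=32)  # mirrors events(buffer=32)
    handled: list = []
    await asyncio.gather(producer(queue, 100), consumer(queue, handled))
    return handled

print(len(asyncio.run(main())))  # prints 100
```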

Typical Deployment Pattern

[MCP Clients] → [cloudrun-mcp SDK] → [Cloud Run Service]
                                          ↳ [Agent Processors / Vector DB / PubSub Pipelines]

cloudrun-mcp acts as the async bridge between Cloud identity and AI reasoning streams.

Real‑World Use Cases

  • Event‑Driven AI Agents – Agents listening to MCP streams and triggering workflows automatically.
  • LLM Orchestration Pipelines – Streaming intermediate reasoning steps to dashboards.
  • IoT Telemetry Ingestion – Continuous SSE device streams pushed to Pub/Sub.
  • Hybrid Edge Inference – Bridge local MCP hubs with Cloud Run decision services.

Design Philosophy

The SDK follows three principles:

  • Async First — built entirely on asyncio
  • Zero Secrets — uses Workload Identity exclusively
  • Agentic Friendly — integrates with frameworks like LangChain or CrewAI