Announcing the AI Gateway Working Group
Source: Kubernetes Blog
Introduction
The Kubernetes community includes many Special Interest Groups (SIGs) and Working Groups (WGs) that facilitate discussions on important topics. Today we’re excited to announce the formation of the AI Gateway Working Group, a new initiative focused on developing standards and best practices for networking infrastructure that supports AI workloads in Kubernetes environments.
What is an AI Gateway?
In a Kubernetes context, an AI Gateway refers to network gateway infrastructure (e.g., proxy servers, load balancers) that implements the Gateway API specification with enhanced capabilities for AI workloads. AI Gateways are not a distinct product category; they describe infrastructure designed to enforce policy on AI traffic, including:
- Token‑based rate limiting for AI APIs
- Fine‑grained access controls for inference APIs
- Payload inspection enabling intelligent routing, caching, and guardrails
- Support for AI‑specific protocols and routing patterns
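To make the idea concrete, here is a minimal sketch of what this could look like in practice: a standard Gateway API `HTTPRoute` fronting an inference backend, with a token-based rate-limit policy attached. The `HTTPRoute` portion uses the real `gateway.networking.k8s.io/v1` API; the `TokenRateLimitPolicy` resource, its `example.gateway.ai` group, and all of its fields are hypothetical, since no such policy API has been standardized yet.

```yaml
# Standard Gateway API route for an inference backend.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: chat-completions
spec:
  parentRefs:
    - name: ai-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
      backendRefs:
        - name: llm-inference
          port: 8080
---
# Hypothetical policy resource illustrating token-based rate limiting.
# The group, kind, and fields below are invented for illustration.
apiVersion: example.gateway.ai/v1alpha1
kind: TokenRateLimitPolicy
metadata:
  name: chat-token-budget
spec:
  targetRef:
    kind: HTTPRoute
    name: chat-completions
  limit:
    tokensPerMinute: 100000   # budget counted in LLM tokens, not HTTP requests
```

The key difference from conventional rate limiting is the unit of accounting: budgets are expressed in model tokens rather than request counts, which requires the gateway to inspect payloads rather than just headers.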
Working Group Charter and Mission
The AI Gateway Working Group operates under a charter whose mission is to develop proposals for Kubernetes SIGs and their sub‑projects.

Primary goals
- Standards Development – Create declarative APIs, standards, and guidance for AI workload networking in Kubernetes.
- Community Collaboration – Foster discussions and build consensus around best practices for AI infrastructure.
- Extensible Architecture – Ensure composability, pluggability, and ordered processing for AI‑specific gateway extensions.
- Standards‑Based Approach – Build on established networking foundations, layering AI‑specific capabilities on top of proven standards.
Active Proposals
Payload Processing
The payload processing proposal addresses the need for AI workloads to inspect and transform full HTTP request and response payloads. It enables:
- AI Inference Security – Protection against malicious prompts and prompt‑injection attacks, content filtering for AI responses, signature‑based detection, and anomaly detection for AI traffic.
- AI Inference Optimization – Semantic routing based on request content, intelligent caching to reduce inference costs and improve response times, and Retrieval‑Augmented Generation (RAG) system integration for context enhancement.
The proposal defines standards for declarative payload processor configuration, ordered processing pipelines, and configurable failure modes—essential for production AI workload deployments.
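A declarative pipeline of this kind might be expressed roughly as follows. This is a sketch only: the `PayloadProcessorPipeline` kind, its `example.gateway.ai` group, and every field name are hypothetical, invented here to illustrate the three properties the proposal calls out — declarative configuration, ordered processing, and configurable failure modes.

```yaml
# Hypothetical declarative payload-processor pipeline; all names
# below are illustrative, not part of any standardized API.
apiVersion: example.gateway.ai/v1alpha1
kind: PayloadProcessorPipeline
metadata:
  name: inference-guardrails
spec:
  targetRef:
    kind: HTTPRoute
    name: chat-completions
  processors:                     # executed in the order listed
    - name: prompt-injection-filter
      phase: Request
      failureMode: FailClosed     # reject traffic if the filter is unavailable
    - name: semantic-cache
      phase: Request
      failureMode: FailOpen       # fall through to the backend on error
    - name: response-content-filter
      phase: Response
      failureMode: FailClosed
```

The per-processor failure mode is what makes such a pipeline production-viable: a security filter can fail closed while an optimization such as caching fails open, so an outage in a non-critical processor does not take down inference traffic.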
Egress Gateways
Modern AI applications increasingly depend on external inference services for specialized models, failover, or cost optimization. The egress gateways proposal aims to define standards for securely routing traffic outside the cluster.
Key features
- External AI Service Integration – Secure access to cloud‑based AI services (OpenAI, Vertex AI, Bedrock, etc.), managed authentication and token injection for third‑party AI APIs, regional compliance, and failover capabilities.
- Advanced Traffic Management – Backend resource definitions for external FQDNs and services, TLS policy management and certificate‑authority control, and cross‑cluster routing for centralized AI infrastructure.
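A backend definition for an external AI service might look something like the sketch below. The `Backend` kind, its `example.gateway.ai` group, and the `auth` and `tls` fields are hypothetical, modeled loosely on the proposal's stated goals (FQDN backends, TLS/CA control, and managed token injection); they are not an existing API.

```yaml
# Illustrative sketch of an external-backend definition; the kind,
# group, and fields are invented to mirror the proposal's goals.
apiVersion: example.gateway.ai/v1alpha1
kind: Backend
metadata:
  name: openai-api
spec:
  endpoints:
    - fqdn:
        hostname: api.openai.com
        port: 443
  tls:
    wellKnownCACertificates: System   # validate against the system CA bundle
  auth:
    tokenInjection:
      secretRef:
        name: openai-api-key          # gateway injects the bearer token
```

Centralizing credentials and TLS policy at the egress gateway means application pods never hold third‑party API keys directly, which is what enables the compliance and failover stories described below.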
User Stories We’re Addressing
- Platform operators providing managed access to external AI services.
- Developers requiring inference failover across multiple cloud providers.
- Compliance engineers enforcing regional restrictions on AI traffic.
- Organizations centralizing AI workloads on dedicated clusters.
Upcoming Events
KubeCon + CloudNativeCon Europe 2026, Amsterdam
The AI Gateway Working Group will present at KubeCon + CloudNativeCon Europe, discussing challenges at the intersection of AI and networking, the group’s active proposals, and the relationship between AI gateways, the Model Context Protocol (MCP), and agent networking patterns.
Get Involved
The AI Gateway Working Group represents the Kubernetes community’s commitment to standardizing AI workload networking. As AI becomes integral to modern applications, robust, standardized infrastructure is needed to support inference workloads while maintaining security, observability, and reliability.
Whether you’re a gateway implementer, platform operator, AI application developer, or simply interested in the intersection of Kubernetes and AI, we welcome your input. The working group follows an open contribution model—you can review proposals, join weekly meetings, or start discussions on GitHub.
Ways to contribute
- Visit the working group’s umbrella GitHub repository.
- Read the working group’s charter.
- Join the weekly meeting on Thursdays at 2 PM EST.
- Connect on Slack (#wg-ai-gateway).
- Subscribe to the AI Gateway mailing list.
The future of AI infrastructure in Kubernetes is being built today. Join us and help shape AI‑aware gateway capabilities in Kubernetes.