Under the Hood: How Argo Rollouts 1.8 Implements Canary Deployments with Kubernetes 1.33 and Prometheus 3.1
Source: Dev.to
Prerequisites and Stack Compatibility
Argo Rollouts 1.8 is purpose‑built to leverage Kubernetes 1.33’s enhanced workload APIs, including stable support for Deployment and ReplicaSet lifecycle hooks, plus Prometheus 3.1’s native histogram metrics for low‑latency canary analysis. Key compatibility notes:
- Kubernetes 1.33+ is required for Argo Rollouts’ new `Rollout` controller admission webhooks, which validate canary configuration at creation time.
- Prometheus 3.1’s `prometheus-operator` v0.70+ integration enables automatic metric scraping for canary analysis rules.
- Argo Rollouts 1.8 drops support for Kubernetes versions below 1.28, aligning with upstream deprecation policies.
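As a pre-install sanity check, the minimum-version rule above can be sketched as a small helper. This is a hypothetical function for illustration, not part of Argo Rollouts itself:

```python
def meets_requirements(k8s_version: str, min_version: str = "1.33") -> bool:
    """Return True if the cluster's major.minor version satisfies the minimum.

    Illustrative only -- Argo Rollouts performs its own compatibility checks;
    this simply mirrors the version rules stated above.
    """
    def parse(v: str) -> tuple[int, int]:
        # Accept both "1.33.1" and "v1.33.1"; compare major.minor only.
        major, minor = v.lstrip("v").split(".")[:2]
        return int(major), int(minor)

    return parse(k8s_version) >= parse(min_version)
```

For example, `meets_requirements("v1.33.1")` passes, while a 1.27 cluster fails both this check and the 1.28 support floor.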
Argo Rollouts 1.8 Canary Architecture
The core Argo Rollouts 1.8 canary workflow relies on three components, updated for K8s 1.33 and Prometheus 3.1:
| Component | Responsibility |
|---|---|
| Rollout Controller | Watches Rollout custom resources (CRs), manages canary ReplicaSet creation, and updates Kubernetes Service and Ingress objects to split traffic. |
| Analysis Controller | Queries Prometheus 3.1 for canary health metrics, evaluates analysis templates, and signals the Rollout Controller to progress or abort the canary. |
| Metrics Server | Aggregates real‑time traffic and error‑rate metrics from K8s 1.33’s kube-proxy and Prometheus 3.1 exporters. |
Under‑the‑Hood Traffic Splitting with Kubernetes 1.33
Kubernetes 1.33 introduces stable support for Service traffic policy enhancements, which Argo Rollouts 1.8 uses to implement canary traffic splitting without third‑party service meshes (though mesh integration is still supported).
Workflow
When a Rollout CR is updated with a new container image, the Rollout Controller:
- Creates a canary `ReplicaSet` with the new image, scaled to 0 replicas initially.
- Updates the primary `Service` selector to include a `rollout.argoproj.io/canary: "true"` label for canary pods and `rollout.argoproj.io/stable: "true"` for stable pods.
- Uses K8s 1.33’s `EndpointSlice` API to split traffic between stable and canary `EndpointSlice` objects based on the canary percentage defined in the `Rollout` spec.
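The weight-to-pod arithmetic behind that last step can be sketched as follows. This is an illustrative approximation assuming simple ceiling rounding; the real controller also honors `maxSurge`, `maxUnavailable`, and `setCanaryScale` overrides:

```python
import math

def canary_replica_count(total_replicas: int, weight_percent: int) -> int:
    """Approximate the canary ReplicaSet size for a given setWeight step.

    Sketch only: assumes the canary scale is the ceiling of the requested
    traffic fraction applied to the Rollout's replica count.
    """
    return math.ceil(total_replicas * weight_percent / 100)
```

With `replicas: 10`, a `setWeight: 10` step yields 1 canary pod and `setWeight: 50` yields 5, which is the intuition behind the spec below.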
Example Rollout traffic‑splitting snippet
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: canary-demo
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
        - setWeight: 50
        - pause: {duration: 10m}
        - setWeight: 100
      trafficRouting:
        kubernetes:
          service: canary-demo-svc
          ingress:
            name: canary-demo-ingress
  selector:
    matchLabels:
      app: canary-demo
  template:
    metadata:
      labels:
        app: canary-demo
    spec:
      containers:
        - name: demo-app
          image: demo-app:v2.0.0
          ports:
            - containerPort: 8080
```
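The `steps` list walks the rollout through discrete checkpoints: each `setWeight` shifts traffic, and each `pause` holds the rollout so analysis can run before the next promotion. A minimal sketch of how those checkpoints unfold (a hypothetical helper, not controller code):

```python
def walk_canary_steps(steps):
    """Yield (weight, pause_duration) checkpoints from a canary steps list.

    Sketch only: each setWeight updates the current traffic weight, each
    pause emits a checkpoint at that weight; a trailing checkpoint marks
    full promotion.
    """
    weight = 0
    for step in steps:
        if "setWeight" in step:
            weight = step["setWeight"]
        elif "pause" in step:
            yield weight, step["pause"].get("duration", "indefinite")
    yield weight, None  # final state: fully promoted

steps = [
    {"setWeight": 10},
    {"pause": {"duration": "5m"}},
    {"setWeight": 50},
    {"pause": {"duration": "10m"}},
    {"setWeight": 100},
]
```

Walking the spec above produces checkpoints at 10 % (held 5 m), 50 % (held 10 m), and finally 100 %.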
Prometheus 3.1 Integration for Canary Analysis
Argo Rollouts 1.8 leverages Prometheus 3.1’s native histogram and exponential bucket metrics to evaluate canary health with lower query latency than previous versions. The Analysis Controller polls Prometheus 3.1 at configurable intervals using the PrometheusQuery API, then compares results against user‑defined success thresholds.
Example AnalysisTemplate for Prometheus 3.1
```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: prometheus-canary-analysis
spec:
  metrics:
    - name: error-rate
      successCondition: result[0] < 0.05
      provider:
        prometheus:
          address: http://prometheus.istio-system.svc:9090
          query: |
            sum(rate(http_requests_total{app="canary-demo", status=~"5.."}[5m])) /
            sum(rate(http_requests_total{app="canary-demo"}[5m]))
```
Prometheus 3.1’s new remote_write optimizations reduce metric lag to under 1 second, ensuring Argo Rollouts 1.8 can make canary progression decisions in near real‑time.
Key Optimizations in Argo Rollouts 1.8
Beyond K8s 1.33 and Prometheus 3.1 integration, Argo Rollouts 1.8 includes several under‑the‑hood improvements:
- Reduced memory footprint – Rollout Controller memory usage drops by ~30 % thanks to K8s 1.33’s shared informer cache optimizations.
- Native support for Prometheus 3.1 `exemplar` metrics, enabling trace‑to‑metric correlation for canary debugging.
- Improved canary abort logic – if Prometheus 3.1 reports a threshold breach, the Rollout Controller automatically scales down the canary `ReplicaSet` and restores 100 % traffic to the stable version within 2 seconds.
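The abort path amounts to resetting scale and traffic in one reconciliation pass. A toy sketch using a plain dict as a stand-in for controller state (not the real Rollout API object):

```python
def abort_canary(state):
    """Sketch of the abort path: zero the canary and send all traffic stable.

    `state` is a hypothetical dict standing in for what the controller
    tracks; the real implementation patches the ReplicaSet, Service, and
    Rollout status objects through the Kubernetes API.
    """
    state["canary_replicas"] = 0      # scale down the canary ReplicaSet
    state["canary_weight"] = 0        # stop routing traffic to canary pods
    state["stable_weight"] = 100      # restore full traffic to stable
    state["phase"] = "Degraded"       # surface the abort in Rollout status
    return state
```

Calling it against a mid-rollout state (say, 50/50 traffic with 5 canary pods) returns the fully restored stable configuration.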
Conclusion
Argo Rollouts 1.8, paired with Kubernetes 1.33 and Prometheus 3.1, delivers a robust, low‑latency canary deployment workflow without relying on complex service‑mesh configurations. The tight integration with K8s 1.33’s traffic‑routing APIs and Prometheus 3.1’s high‑performance metrics engine makes it an ideal choice for teams running modern, cloud‑native workloads.
For full release notes, refer to the Argo Rollouts 1.8 changelog.