AWS re:Invent 2025 - Architecting resilient multicloud operations, feat. Monzo Bank (HMC201)

Published: December 6, 2025 at 07:39 AM EST

Source: Dev.to

Overview

AWS Principal Technologists Clark Richey and Bruno Emer, together with Monzo Bank’s Andrew Lawson, discuss strategies for building resilient multi‑cloud operations. They introduce the SEEMS framework for analyzing failure modes and describe Monzo’s “Stand‑in” platform, a lightweight, cost‑effective backup system that runs on Google Cloud while the primary platform operates on AWS.

The SEEMS Framework

Letter	Failure Mode	Description
S	Single points of failure	Components whose loss would bring down the entire system.
E	Excessive load	Situations where traffic overwhelms a service, causing degradation or outage.
E	Excessive latency	High response times that can break user experiences or downstream dependencies.
M	Misconfiguration / bugs	Human errors or software defects that introduce instability.
S	Shared fate	Resources that are tightly coupled across clouds, creating cascading failures.

The framework helps teams systematically evaluate where resilience can be improved across multiple cloud providers.
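
One way to apply the framework in a design review is to record which failure modes have already been assessed for each component and list what is still missing. The sketch below is only an illustration: the `FailureMode` enum mirrors the table above, while `uncovered_modes`, the shape of `review`, and the component names are hypothetical and were not presented in the talk.

```python
from enum import Enum


class FailureMode(Enum):
    """The five SEEMS failure modes, in acronym order."""
    SINGLE_POINT_OF_FAILURE = "Single points of failure"
    EXCESSIVE_LOAD = "Excessive load"
    EXCESSIVE_LATENCY = "Excessive latency"
    MISCONFIGURATION_BUGS = "Misconfiguration / bugs"
    SHARED_FATE = "Shared fate"


def uncovered_modes(review):
    """Return, per component, the failure modes not yet assessed.

    `review` maps a component name to the set of FailureMode values
    the team has already looked at (a hypothetical data shape).
    """
    all_modes = list(FailureMode)
    return {
        component: [mode for mode in all_modes if mode not in assessed]
        for component, assessed in review.items()
    }


# Hypothetical review state for two components.
review = {
    "payments-api": {FailureMode.EXCESSIVE_LOAD, FailureMode.SHARED_FATE},
    "dns": set(FailureMode),
}
gaps = uncovered_modes(review)
for component, missing in gaps.items():
    print(component, [mode.name for mode in missing])
```

Keeping the modes in an enum means a newly added component starts with all five modes flagged as gaps until someone records an assessment.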

Multi‑Cloud Resilience: Myths & Realities

Complexity vs. Resilience – Adding cloud providers increases architectural complexity, which can reduce resilience if not managed carefully.
When Multi‑Cloud Helps – It is valuable for specific scenarios such as:
- Disaster recovery (DR) where regulatory or data‑sovereignty rules prevent data from leaving a country, for example when the primary provider has only one region in that country.
- Situations requiring geographic redundancy beyond a single provider’s regions.

Multi‑cloud is not a blanket solution; it must be applied intentionally with clear goals.

Monzo’s “Stand‑in” Platform

Purpose – Acts as a lifeboat: a simplified banking system that can take over core transaction processing if the primary AWS environment fails.
Architecture – Runs on Google Cloud, mirroring essential services of Monzo’s main platform.
Cost – Operates at roughly 1% of the primary platform’s cost.
Production Use – Processes real customer transactions daily for testing and has been used successfully during actual incidents.
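
This summary does not say how Monzo decides when to activate Stand‑in, so the following is a generic pattern rather than their mechanism. It sketches a controller that keeps traffic on the primary platform until a number of consecutive health checks fail, then routes to the backup. The class name, the `threshold` parameter, and the platform labels are all assumptions made for illustration.

```python
class FailoverController:
    """Routes to the primary until it fails `threshold` checks in a row."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive_failures = 0
        self.active = "primary"

    def record_health_check(self, primary_healthy):
        """Update state from one health-check result; return the active platform."""
        if primary_healthy:
            # Simplification: fail back as soon as the primary looks healthy.
            self.consecutive_failures = 0
            self.active = "primary"
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.active = "stand-in"
        return self.active


controller = FailoverController(threshold=3)
results = [True, False, False, False, True]
history = [controller.record_health_check(r) for r in results]
print(history)
# ['primary', 'primary', 'primary', 'stand-in', 'primary']
```

Requiring several consecutive failures avoids switching platforms on a single transient error. A real system would likely make failing back a deliberate, possibly manual, step instead of the immediate return shown here.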

Best Practices for Resilient Multi‑Cloud Operations

Fault Isolation

Keep clear boundaries between providers to avoid shared‑fate failures.
Use separate VPCs, IAM roles, and networking configurations per cloud.
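
A simple way to check for shared fate is to list what each platform depends on and look for overlap. The sketch below assumes a hand-maintained dependency inventory; the function name and the dependency labels are hypothetical.

```python
def shared_dependencies(platforms):
    """Return the dependencies that every platform relies on.

    `platforms` maps a platform name to the set of dependencies it needs
    to serve traffic. Anything returned is a shared-fate risk: if it
    fails, all platforms fail together.
    """
    dependency_sets = [set(deps) for deps in platforms.values()]
    if not dependency_sets:
        return set()
    return set.intersection(*dependency_sets)


# Hypothetical inventory for a primary and a backup platform.
platforms = {
    "primary (AWS)": {"aws-vpc", "aws-iam", "shared-dns-zone"},
    "backup (Google Cloud)": {"gcp-vpc", "gcp-iam", "shared-dns-zone"},
}
risks = shared_dependencies(platforms)
print(sorted(risks))
# ['shared-dns-zone']
```

An empty result does not prove isolation, since it only reflects what was written into the inventory, but a non-empty one points directly at a boundary that needs attention.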

Observability

Implement unified logging, metrics, and tracing that span all clouds.
Ensure alerts surface provider‑specific issues as well as cross‑cloud dependencies.

Comprehensive Testing

Conduct regular chaos engineering experiments that simulate provider outages, network partitions, and latency spikes.
Validate DR runbooks by actually failing over workloads between clouds.
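
The idea of simulating a provider outage can be illustrated in miniature with a fault-injecting wrapper. This is a toy, in-process sketch and not a real chaos engineering tool: `FaultInjector`, `run_experiment`, and `handle` are hypothetical names, and the "platforms" are plain Python functions.

```python
import random


class FaultInjector:
    """Wraps a callable and raises ConnectionError for a fraction of calls."""

    def __init__(self, target, failure_rate, seed=0):
        self.target = target
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def __call__(self, request):
        # random() is in [0.0, 1.0), so a rate of 1.0 fails every call.
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected outage")
        return self.target(request)


def run_experiment(primary, backup, requests):
    """Send each request to primary, falling back to backup on failure.

    Returns how many requests each platform served and how many failed.
    """
    served = {"primary": 0, "backup": 0, "failed": 0}
    for request in requests:
        try:
            primary(request)
            served["primary"] += 1
        except ConnectionError:
            try:
                backup(request)
                served["backup"] += 1
            except ConnectionError:
                served["failed"] += 1
    return served


def handle(request):
    """Stand-in for a service call that always succeeds."""
    return f"ok:{request}"


# Simulate a total outage of the primary provider.
outage = run_experiment(FaultInjector(handle, failure_rate=1.0), handle, range(100))
print(outage)
# {'primary': 0, 'backup': 100, 'failed': 0}
```

The useful assertion in an experiment like this is on the `failed` count: the outage is expected, and what is being tested is whether requests still get served.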

Critical Dependency Management

Avoid single points of failure for services such as DNS, authentication, and configuration stores.
Deploy redundant instances of these services in each cloud, or use globally distributed solutions.
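
For a dependency such as a configuration store, redundancy per cloud only helps if clients know how to try the other copy. The sketch below shows one possible client-side pattern under stated assumptions: `lookup_with_fallback` is a hypothetical helper, and the two stores are stubbed with plain functions rather than real AWS or Google Cloud clients.

```python
def lookup_with_fallback(key, stores):
    """Read `key` from the first store that answers.

    `stores` is an ordered list of (name, fetch) pairs, where `fetch`
    returns the value or raises ConnectionError or KeyError.
    Returns a (store_name, value) tuple.
    """
    errors = []
    for name, fetch in stores:
        try:
            return name, fetch(key)
        except (ConnectionError, KeyError) as exc:
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all stores failed: " + "; ".join(errors))


# Stubbed stores: the first is unreachable, the second has the value.
def aws_store(key):
    raise ConnectionError("unreachable")


gcp_data = {"feature.payments": "enabled"}


def gcp_store(key):
    return gcp_data[key]


source, value = lookup_with_fallback(
    "feature.payments", [("aws", aws_store), ("gcp", gcp_store)]
)
print(source, value)
# gcp enabled
```

Returning the store name alongside the value makes it easy to log or alert when reads are being served from the fallback copy.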

Conclusion

Multi‑cloud can enhance resilience when applied to well‑defined problems like regulatory‑driven DR or geographic redundancy. The SEEMS framework provides a structured way to identify and mitigate failure modes. Monzo’s “Stand‑in” platform demonstrates that a lightweight, cost‑effective backup environment can operate in production and serve as a reliable safety net. By enforcing fault isolation, maintaining robust observability, rigorously testing, and eliminating single points of failure, organizations can reap the benefits of multi‑cloud without succumbing to its added complexity.
