How World Bank manages hybrid cloud complexity with Terraform
Source: HashiCorp Blog
“Developers always take the easy path. So if you make the easiest way the right way, then you have the right outcome.”
— Suneer Pallitharammal Mukkolakal, Lead Platform Engineer, World Bank
That’s the philosophy that drove the World Bank’s platform‑engineering transformation. They went from 5‑day infrastructure‑provisioning timelines filled with manual processes and configuration drift to a 30‑minute self‑service platform that manages 27,000 cloud resources and supports 1,700 applications across Azure, AWS, and GCP.
The backbone of this new strategy was HCP Terraform, which the World Bank’s platform team used to build golden paths that make security, compliance, and best practices the default.
Learn how they turned their hybrid “snowflake” infrastructure pipelines into standardized platform products.
This blog post is based on a HashiConf session from Suneer Pallitharammal Mukkolakal, Lead Platform Engineer at the World Bank.
Hybrid‑cloud challenges
The World Bank faced five categories of complexity:
- Manual processes
  - Click‑ops and ticket‑ops everywhere.
  - Configurations done by hand across multiple cloud subscriptions.
  - Manual processes leave fingerprints that become technical debt you manage forever.
- Configuration drift
  - Inconsistent dev environments that looked nothing like production.
  - Lack of standard workflows to prevent drift.
- Compliance
  - New cyber threats and regulations emerging every day.
  - Cloud vendors constantly improving products, forcing everyone to play catch‑up.
  - Keeping up with compliance requirements in a fragmented environment.
- Bespoke apps
  - Every app treated as a special case requiring custom care.
  - Data teams needed handcrafted data platforms.
  - App teams needed custom application‑hosting environments.
  - Each request meant: gather requirements → sit with the team → design specifically for them → build and manage forever.
  - Hundreds or thousands of these one‑off platforms ran across their estate.
- Cognitive load
  - Platform teams were managing unique “snowflakes.”
  - Developers, data engineers, and data scientists had to deal with complex, inconsistent setups.
The turning point
The CIO put forward a digital‑transformation plan centered on building a strong platform‑engineering strategy.
Platform‑engineering strategy
The strategy is built on four pillars:
1. Developer experience
- Internal developer portal for self‑service capabilities.
- Golden paths for apps, data, and AI applications.
- Scorecards to track platform quality and usage.
2. Security by design
- Security policies converted to code (instead of sitting in Word documents).
- Version control, auditing, and automated pre‑deploy enforcement for all security policies.
- Strong secrets‑management strategy.
- Security scorecards for visibility.
3. Unified platform standards
- Reusable, composable Terraform modules as the foundation.
- Modules published in an internal private registry in the HashiCorp Cloud Platform (HCP) version of Terraform.
- Standard governance: changes happen through code with a proper pull‑request process.
- Standards can be changed and applied in a versioned manner.
4. AI‑embedded workflow
- Coding assistance for all developers.
- AI‑assisted policy checks.
- Strategy for AI‑generated test cases, intelligent observability, and AI Ops.
- Automated documentation (not just for humans; also for prompt libraries to guide AI coders).
Platform‑engineering framework
World Bank implemented the framework bottom‑up using a pyramid approach:
- Infrastructure modules – Build all necessary Terraform modules with security hardening and best practices baked in.
- Module bundles (“templates”) – Provide complete, usable deployments for developers (e.g., Terraform Stacks).
- Developer workflows – Orchestrate Day 1 and Day 2 functions with Terraform and other tools.
- Golden‑path encoding – Observe common patterns, encode them with best practices and controls, and expose them as options in a platform‑as‑a‑product.
- Storefront – Deliver everything through an internal developer portal.
The Terraform self‑service workflow
After the framework was built, developers could securely and quickly start using a pre‑built application‑deployment stack. The process:
- Developer requests a golden‑path product in the Internal Developer Portal (IDP).
- This triggers a platform‑management pipeline that:
  - Deploys HCP Terraform workspaces.
  - Prefills all variables.
  - Connects to the agent that will execute.
  - Configures the Git repository with golden‑path definitions.
- HCP Terraform runs `plan` and `apply` to deploy the golden‑path patterns.
Platform engineers manage operations at the Git‑repository level, keeping maintenance simple for everyone.
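The pipeline steps above (create a workspace, prefill variables, attach an agent, connect the Git repository) can be sketched as plain request payloads. This is a minimal illustration, not the World Bank's pipeline: the `GoldenPathRequest` fields, the naming scheme, and all IDs are hypothetical, and the attribute names follow the JSON:API shape of the HCP Terraform workspaces and variables endpoints, so they should be checked against the current API documentation before use. The code only builds payloads and makes no network calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GoldenPathRequest:
    """What a developer submits in the portal (all fields are hypothetical)."""
    app_name: str
    environment: str
    repo: str              # VCS repository holding the golden-path definition
    agent_pool_id: str     # agent pool that will execute the runs
    oauth_token_id: str    # VCS connection registered in HCP Terraform
    variables: Dict[str, object] = field(default_factory=dict)


def _to_tf_string(value: object) -> str:
    """Render a Python value as the string form a Terraform variable expects."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def workspace_payload(req: GoldenPathRequest) -> dict:
    """Body for creating an agent-executed workspace linked to a Git repository."""
    return {
        "data": {
            "type": "workspaces",
            "attributes": {
                "name": f"{req.app_name}-{req.environment}",
                "execution-mode": "agent",
                "agent-pool-id": req.agent_pool_id,
                "vcs-repo": {
                    "identifier": req.repo,
                    "oauth-token-id": req.oauth_token_id,
                },
            },
        }
    }


def variable_payloads(req: GoldenPathRequest) -> List[dict]:
    """One body per prefilled Terraform variable, in a stable (sorted) order."""
    return [
        {
            "data": {
                "type": "vars",
                "attributes": {
                    "key": key,
                    "value": _to_tf_string(value),
                    "category": "terraform",
                    "hcl": False,
                    "sensitive": False,
                },
            }
        }
        for key, value in sorted(req.variables.items())
    ]
```

Keeping payload construction separate from the HTTP calls makes the golden-path defaults easy to review and unit-test in the same pull-request flow as the modules themselves.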
Security embedded in the flow
Embedding security early (shift‑left) reduces redevelopment costs and yields better security overall. World Bank’s workflow:
- After `terraform plan` completes, the plan is sent to the World Bank security‑scanning tools via a Terraform run‑task integration.
- Security scans happen at the gateway to provisioning.
- Infrastructure is deployed only if it complies with security standards.
- Policy‑as‑code checks also run at this stage and are enforced automatically.
- If everything passes, the infrastructure is deployed to the hybrid‑cloud environment (Azure, AWS, on‑prem, GCP) in a single unified workflow.
Diagram of the overall self‑service workflow (omitted here).
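The gate described in this section (scan the plan, deploy only if it complies) can be illustrated with a small policy-as-code check over the JSON form of a plan, as produced by `terraform show -json`. This is a sketch under assumptions: the two rules and the resource attributes they inspect are illustrative examples, not the World Bank's actual policies, and a production setup would more likely use a dedicated policy engine behind the run task.

```python
from typing import Callable, List, Optional

# A policy inspects one planned resource change and returns a violation
# message, or None when the resource is compliant or out of scope.
Policy = Callable[[dict], Optional[str]]


def planned_state(rc: dict) -> Optional[dict]:
    """Post-apply attributes, or None when the resource is being deleted."""
    return (rc.get("change") or {}).get("after")


def require_private_storage(rc: dict) -> Optional[str]:
    """Illustrative rule: storage accounts must not allow public network access."""
    state = planned_state(rc)
    if state is None or rc.get("type") != "azurerm_storage_account":
        return None
    if state.get("public_network_access_enabled", True):
        return f"{rc['address']}: public network access must be disabled"
    return None


def require_owner_tag(rc: dict) -> Optional[str]:
    """Illustrative rule: every taggable resource must carry an 'owner' tag."""
    state = planned_state(rc)
    if state is None or "tags" not in state:
        return None
    if "owner" not in (state["tags"] or {}):
        return f"{rc['address']}: missing required 'owner' tag"
    return None


def evaluate_plan(plan: dict, policies: List[Policy]) -> List[str]:
    """Run every policy against every resource change and collect violations."""
    violations = []
    for rc in plan.get("resource_changes", []):
        for policy in policies:
            message = policy(rc)
            if message:
                violations.append(message)
    return violations


def may_apply(plan: dict, policies: List[Policy]) -> bool:
    """The gate: allow the apply phase only when no policy is violated."""
    return not evaluate_plan(plan, policies)
```

Because each rule is an ordinary function in version control, security and compliance teams can change it through the same pull-request process that governs the Terraform modules.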
Architecture view
From an architect’s perspective, each standard template works as a self‑contained, reusable building block that can be combined, versioned, and governed centrally—delivering a consistent, secure, and observable platform experience across the World Bank’s multi‑cloud estate.
World Bank Platform Architecture Overview
The typical logical architecture for World Bank’s platform deployments consists of three planes that separate concerns and provide a clear path from infrastructure to developer experience.
| Plane | Description |
|---|---|
| Resource plane (bottom) | Services consumed from Azure, AWS, GCP, and on‑premises. Engineers have a wide choice of compute, networking, services, and data options. |
| Platform plane (middle) | Delivery pipelines built with Terraform, Ansible, and Argo CD. Two platform products—Application and Data—are offered as services. |
| Developer plane (top) | Developers, data engineers, and data scientists work with IDEs, MCP servers, and AI tools (e.g., Copilot). An internal developer portal (“storefront”) provides a simple web UI. |
Platform Products
1. Application Platform
Core stack
- VNet with network‑security controls
- UI, API, and database capabilities
- Standard runtimes: Node.js, Java, .NET (chosen based on actual usage)
- Database options: PostgreSQL, MySQL, Cosmos DB (NoSQL)
Optional capabilities (togglable)
- Serverless functions (Azure Functions, Lambda‑style)
- Caching (Redis, with built‑in security controls)
- Object storage for videos/files
Security & authentication (non‑negotiable)
- Authentication enabled everywhere (native where possible)
- Managed identity for most use cases
Standard components (always included)
- Monitoring, logging, key management, DNS, managed identity
Benefit: Developers receive security and observability out‑of‑the‑box, allowing them to focus on business logic. The toggle model lets applications start small and add features as they grow.
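The toggle model can be pictured as a small request object that always resolves to the mandatory components and adds optional ones only when asked. The sketch below is an assumption-laden illustration: the component names, the `AppPlatformRequest` type, and the validation rules are hypothetical and only mirror the lists above.

```python
from dataclasses import dataclass
from typing import List

# Components every application receives, regardless of toggles.
ALWAYS_INCLUDED = ("monitoring", "logging", "key_management", "dns", "managed_identity")
RUNTIMES = {"nodejs", "java", "dotnet"}
DATABASES = {"postgresql", "mysql", "cosmosdb"}


@dataclass(frozen=True)
class AppPlatformRequest:
    """A developer's choices; optional capabilities default to off."""
    runtime: str
    database: str
    serverless: bool = False
    cache: bool = False
    object_storage: bool = False


def resolve_components(req: AppPlatformRequest) -> List[str]:
    """Expand a request into the list of components to deploy."""
    if req.runtime not in RUNTIMES:
        raise ValueError(f"unsupported runtime: {req.runtime}")
    if req.database not in DATABASES:
        raise ValueError(f"unsupported database: {req.database}")

    # Core stack and standard components are not negotiable.
    components = ["vnet", f"runtime_{req.runtime}", f"database_{req.database}"]
    components.extend(ALWAYS_INCLUDED)

    # Optional capabilities are added only when toggled on.
    if req.serverless:
        components.append("serverless_functions")
    if req.cache:
        components.append("redis_cache")
    if req.object_storage:
        components.append("object_storage")
    return components
```

An application can start with `AppPlatformRequest("nodejs", "postgresql")` and later flip `cache=True`; the security and observability components are present in both cases.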
2. Data Platform
Compute options
- Analytical platforms – large clusters (Databricks, AWS EMR)
- Data movement – integration services (Azure Data Factory, AWS Glue)
- All protected with private endpoints
Storage options
- Data lake
- Relational databases
- NoSQL databases
AI capabilities
- Vector databases for Retrieval‑Augmented Generation (RAG)
- LLM APIs (Azure, OpenAI, Claude, etc.)
Security
- Private endpoints / private links across clouds
- Strong key‑management for database connections
Benefit: The toggle approach makes these capabilities self‑service; teams can enable what they need without contacting the platform team.
Day‑2 Operations for Platform Engineers
- Git‑centric management – all platform products live in GitHub and are codified in Terraform.
- Rapid versioning – e.g., a security vulnerability → update Terraform module → release new version.
- Scalable policy updates – security/compliance teams modify checks and policies as versioned code.
Result: Tasks that once took days now take minutes.
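The rapid-versioning flow (vulnerability found, module patched, new version released) leaves one follow-up question: which deployments still pin an older version? A minimal sketch, assuming a hypothetical inventory that maps each workspace to its pinned module versions and assuming plain `MAJOR.MINOR.PATCH` version strings without pre-release suffixes:

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def workspaces_to_upgrade(
    pins: Dict[str, Dict[str, str]], module: str, patched: str
) -> List[str]:
    """Workspaces that use `module` at a version older than the patched release."""
    target = parse_version(patched)
    return sorted(
        workspace
        for workspace, modules in pins.items()
        if module in modules and parse_version(modules[module]) < target
    )
```

Numeric comparison matters here: compared as strings, `"1.9.3"` would sort after `"1.10.0"` and a vulnerable workspace would be missed.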
Key Results
| Metric | Impact |
|---|---|
| Delivery velocity | Provisioning time reduced from 5 days to 30 minutes (execution time only). |
| Standardization | 70% of teams now use the standardized platform offerings; golden‑path optionality drives uniform service consumption. |
| Scale & reach | 27,000 cloud resources deployed in record time; 1,700 applications supported by these patterns; infrastructure remains secure, consistent, and compliant. |
Lessons Learned
- Start small, standardize, and automate relentlessly
  - Identify a single bottleneck and automate it end‑to‑end.
- Adopt modular, flexible architectures
  - Keep Terraform modules minimal; avoid over‑engineering or over‑configuring.
- Apply team topologies & agile ways of working
  - Align security, development, data‑science, and compliance teams for incremental progress.
  - Use clear patterns that serve the majority of use cases; accept that a few “snowflake” workloads will remain.
- Embed AI in platform‑engineering workflows
  - Policy checks, coding assistance, and AI‑powered automation throughout the pipeline.
- Make the easy path the right path
  - Provide secure, compliant defaults that are effortless for developers.
  - Reducing operational friction prevents shadow‑IT and keeps teams focused on value‑adding work.
## Creativity and Value
> "Your developers' creative work truly begins where their wait time ends."
> — *Suneer Pallitharammal Mukkolakal*, Lead Platform Engineer, World Bank
To learn more about how we can help your company navigate the complexities of hybrid infrastructure for more secure, automated operations, read our **guide to navigating cloud complexity** and drop us a line to talk about your unique IT challenges.