How I Built a Production-Grade Multi-Tier Application on AWS ECS Fargate (A Complete Case Study)
Source: Dev.to
Project Summary
The system is a simple two‑service architecture:
- React frontend served by Nginx
- Node.js backend API
The result: a fully private backend and a publicly accessible frontend communicating securely inside the VPC.
High‑Level Architecture
- Public ALB → receives internet traffic
- Internal ALB → routes traffic to the backend (zero exposure to the public internet)
Network Design
- VPC
- Subnets (public & private)
- Route tables
- Security groups
Traffic path:
- Internet → Public ALB
- Public ALB → Frontend service (public subnet)
- Frontend → Internal ALB → Backend service (private subnet)
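The traffic path above can be read as a chain of security group rules in which each tier only accepts traffic from the tier directly in front of it. The following is a minimal sketch of that idea in plain Python, not the project's actual configuration: the group names and the ports (80 for Nginx and the ALBs, 3000 for the Node.js API) are assumptions made for illustration.

```python
# Hypothetical security groups; names and ports are illustrative assumptions,
# not values taken from the project.
INGRESS_RULES = {
    "public-alb-sg": [{"source": "0.0.0.0/0", "port": 80}],
    "frontend-sg": [{"source": "public-alb-sg", "port": 80}],
    "internal-alb-sg": [{"source": "frontend-sg", "port": 80}],
    "backend-sg": [{"source": "internal-alb-sg", "port": 3000}],
}


def allows(source: str, target: str, port: int) -> bool:
    """Return True if `target` has an ingress rule admitting `source` on `port`."""
    return any(
        rule["source"] == source and rule["port"] == port
        for rule in INGRESS_RULES.get(target, [])
    )


def path_is_open(hops) -> bool:
    """Check that every (source, target, port) hop in a path is permitted."""
    return all(allows(src, dst, port) for src, dst, port in hops)


# The intended path: Internet -> Public ALB -> Frontend -> Internal ALB -> Backend
INTENDED_PATH = [
    ("0.0.0.0/0", "public-alb-sg", 80),
    ("public-alb-sg", "frontend-sg", 80),
    ("frontend-sg", "internal-alb-sg", 80),
    ("internal-alb-sg", "backend-sg", 3000),
]

if __name__ == "__main__":
    print("intended path open:", path_is_open(INTENDED_PATH))
    print("internet -> backend allowed:", allows("0.0.0.0/0", "backend-sg", 3000))
```

Referencing the upstream security group as the source, rather than a CIDR range, is what keeps the backend unreachable from anywhere except the internal ALB.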
Containers & Dockerfiles
- Frontend: Nginx multi‑stage build
- Backend: Node.js
Both images are built locally and pushed to Amazon ECR.
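As a rough illustration of the build-and-push step, the sketch below assembles the ECR image URI and the shell commands usually involved. The account ID (the AWS documentation placeholder 123456789012), region, tag, and build context paths are assumptions; the article does not state them.

```python
import shlex


def ecr_registry(account_id: str, region: str) -> str:
    """Return the registry hostname for a private ECR registry."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id: str, region: str, repo: str, tag: str) -> str:
    """Build the full image URI that a task definition would reference."""
    return f"{ecr_registry(account_id, region)}/{repo}:{tag}"


def push_commands(account_id: str, region: str, repo: str, tag: str, context_dir: str):
    """Return the shell commands to log in, build, and push one image."""
    registry = ecr_registry(account_id, region)
    uri = ecr_image_uri(account_id, region, repo, tag)
    return [
        # Authenticate the local Docker client against ECR.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # Build the image and tag it with the ECR URI in one step.
        f"docker build -t {uri} {shlex.quote(context_dir)}",
        f"docker push {uri}",
    ]


if __name__ == "__main__":
    # Placeholder values; "./frontend" and "./backend" are hypothetical paths.
    for repo in ("frontend", "backend"):
        for cmd in push_commands("123456789012", "us-east-1", repo, "v1", f"./{repo}"):
            print(cmd)
```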
ECR + IAM Setup
- Two repositories: frontend, backend
- IAM role with permissions to pull images from ECR
- VPC endpoints added for ECR to eliminate image‑pull timeouts in private subnets
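For reference, here is a small sketch of the endpoint service names typically involved when Fargate tasks pull from ECR without a route to the internet. The article only mentions endpoints for ECR; the S3 gateway endpoint (ECR stores image layers in S3) and the CloudWatch Logs endpoint (for the awslogs driver) are additions based on AWS's documented requirements, and whether this project used them or a NAT gateway instead is not stated. The region is an assumption.

```python
def private_pull_endpoints(region: str):
    """List VPC endpoint service names commonly needed for private ECR pulls."""
    return [
        # ECR API calls such as fetching an authorization token.
        {"service": f"com.amazonaws.{region}.ecr.api", "type": "Interface"},
        # Docker registry traffic (image manifests).
        {"service": f"com.amazonaws.{region}.ecr.dkr", "type": "Interface"},
        # Image layers are served from S3 (assumption: no NAT gateway in use).
        {"service": f"com.amazonaws.{region}.s3", "type": "Gateway"},
        # Only needed if containers log through the awslogs driver.
        {"service": f"com.amazonaws.{region}.logs", "type": "Interface"},
    ]


if __name__ == "__main__":
    for endpoint in private_pull_endpoints("us-east-1"):
        print(f'{endpoint["type"]:<9} {endpoint["service"]}')
```

Interface endpoints also need a security group that allows HTTPS (port 443) from the task subnets; a missing rule there tends to look like the same pull timeout.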
ECS Design
- Cluster (Fargate)
- Task definitions for frontend and backend
- Services with rolling deployments for updates
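To make the task definition concrete, the sketch below builds a minimal Fargate task definition as a Python dictionary, using the field names accepted by the ECS RegisterTaskDefinition API. The CPU and memory sizes, container port, log group naming, and role ARN are assumptions for illustration, not the project's real values.

```python
import json


def fargate_task_definition(family, image, container_port, region,
                            execution_role_arn, cpu="256", memory="512"):
    """Return a minimal Fargate task definition for a single container."""
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        # Fargate requires awsvpc mode: each task gets its own ENI.
        "networkMode": "awsvpc",
        "cpu": cpu,
        "memory": memory,
        # Role the ECS agent uses to pull from ECR and write logs.
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "protocol": "tcp"}
                ],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": f"/ecs/{family}",  # hypothetical name
                        "awslogs-region": region,
                        "awslogs-stream-prefix": "ecs",
                    },
                },
            }
        ],
    }


if __name__ == "__main__":
    backend = fargate_task_definition(
        family="backend",
        image="123456789012.dkr.ecr.us-east-1.amazonaws.com/backend:v1",
        container_port=3000,  # assumed Node.js port
        region="us-east-1",
        execution_role_arn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    )
    print(json.dumps(backend, indent=2))
```

A dictionary in this shape can be passed to boto3 as `ecs.register_task_definition(**backend)`, or dumped to JSON for the AWS CLI.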
Load Balancing & Routing
- Frontend service attached to the public ALB
- Backend service attached to the internal ALB
- Frontend communicates with backend via the internal ALB
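Attaching a service to a load balancer comes down to two pieces of the service definition: the target group binding and the awsvpc network configuration. The sketch below shows the backend case using the ECS CreateService field names. The subnet IDs, security group ID, target group ARN, cluster name, and desired count are hypothetical placeholders, and it assumes the container name matches the service name.

```python
def fargate_service_spec(cluster, name, task_definition, target_group_arn,
                         container_port, subnets, security_groups,
                         desired_count=2):
    """Return keyword arguments for an ECS service behind an ALB target group."""
    return {
        "cluster": cluster,
        "serviceName": name,
        "taskDefinition": task_definition,
        "desiredCount": desired_count,
        "launchType": "FARGATE",
        "loadBalancers": [
            {
                "targetGroupArn": target_group_arn,
                # Assumption: container name in the task definition == service name.
                "containerName": name,
                "containerPort": container_port,
            }
        ],
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                # Tasks in private subnets get no public IP.
                "assignPublicIp": "DISABLED",
            }
        },
    }


if __name__ == "__main__":
    spec = fargate_service_spec(
        cluster="demo-cluster",  # hypothetical
        name="backend",
        task_definition="backend",
        target_group_arn=(
            "arn:aws:elasticloadbalancing:us-east-1:123456789012:"
            "targetgroup/backend/0123456789abcdef"  # placeholder ARN
        ),
        container_port=3000,
        subnets=["subnet-aaaa1111", "subnet-bbbb2222"],  # placeholder IDs
        security_groups=["sg-cccc3333"],  # placeholder ID
    )
    print(spec["networkConfiguration"])
```

The frontend then only needs the internal ALB's DNS name, for example as an upstream in the Nginx configuration or as an environment variable.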
Rolling Deployments
Flow for a new image rollout:
- Push new image to ECR
- Update task definition with new image tag
- Service performs a rolling update:
  - New tasks are launched and ENIs are provisioned in private subnets
  - New tasks register with the internal ALB
  - Old tasks are drained and stopped
Testing confirmed proper ENI provisioning, ALB registration, log streaming, and graceful connection draining.
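How many tasks run at once during a rollout is governed by the service's deployment configuration. The sketch below works through that arithmetic using ECS's documented rounding (minimum healthy rounds up, maximum rounds down) and its rolling-update defaults of 100% and 200%. The desired counts are example values; the article does not give the project's actual settings.

```python
def rolling_update_bounds(desired_count, minimum_healthy_percent=100,
                          maximum_percent=200):
    """Return (min_running, max_total) task counts during a rolling update."""
    # Integer ceiling: lower bound on tasks that must stay running.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Integer floor: upper bound on old + new tasks combined.
    max_total = desired_count * maximum_percent // 100
    return min_running, max_total


def can_make_progress(desired_count, minimum_healthy_percent=100,
                      maximum_percent=200):
    """A rollout needs room to start a new task or to stop an old one first."""
    min_running, max_total = rolling_update_bounds(
        desired_count, minimum_healthy_percent, maximum_percent
    )
    surge_room = max_total - desired_count
    stop_first_room = desired_count - min_running
    return surge_room > 0 or stop_first_room > 0


if __name__ == "__main__":
    for desired, min_pct, max_pct in [(2, 100, 200), (3, 50, 150), (2, 100, 100)]:
        bounds = rolling_update_bounds(desired, min_pct, max_pct)
        ok = can_make_progress(desired, min_pct, max_pct)
        print(f"desired={desired} min%={min_pct} max%={max_pct} "
              f"bounds={bounds} progress={ok}")
```

With the defaults and two desired tasks, ECS can start two new tasks before stopping any old ones, which is what allows the update to proceed without dropping capacity.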
Key Metrics From the Deployment
- 0 public IPs on ECS tasks (all tasks run in private subnets)
Challenges & Fixes
- ENI attachment delays – mitigated by adjusting task placement strategies.
- Image pull timeouts – resolved by adding VPC endpoints for ECR.
- ALB health‑check failures – fixed by tuning health‑check paths and thresholds.
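For the health-check tuning in particular, it helps to see how the interval and thresholds translate into time. The sketch below uses the parameter names from the ELBv2 target group API. The path `/health` and all numeric values are assumptions, since the article does not list the settings that were finally used, and the timing estimates are approximations.

```python
def health_check_settings(path="/health", interval=15, timeout=5,
                          healthy_threshold=2, unhealthy_threshold=3):
    """Return target group health check settings after a basic sanity check."""
    if timeout >= interval:
        # Sanity check: a probe should finish before the next one starts.
        raise ValueError("timeout must be shorter than the interval")
    return {
        "HealthCheckPath": path,  # hypothetical endpoint on the backend
        "HealthCheckIntervalSeconds": interval,
        "HealthCheckTimeoutSeconds": timeout,
        "HealthyThresholdCount": healthy_threshold,
        "UnhealthyThresholdCount": unhealthy_threshold,
        "Matcher": {"HttpCode": "200"},
    }


def seconds_until_healthy(settings):
    """Approximate time before a new task starts receiving traffic."""
    return settings["HealthCheckIntervalSeconds"] * settings["HealthyThresholdCount"]


def seconds_until_unhealthy(settings):
    """Approximate time before a failing task is taken out of rotation."""
    return settings["HealthCheckIntervalSeconds"] * settings["UnhealthyThresholdCount"]


if __name__ == "__main__":
    settings = health_check_settings()
    print("healthy after ~", seconds_until_healthy(settings), "s")
    print("unhealthy after ~", seconds_until_unhealthy(settings), "s")
```

A path that returns a non-200 status, or thresholds that demand a healthy state before the application has finished starting, are two common ways to end up with tasks being replaced in a loop.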
What I Learned
- How Fargate attaches ENIs inside private subnets.
- The importance of VPC endpoint configuration for private image pulls.
- Strategies for zero‑downtime rolling deployments on ECS/Fargate.
Why This Project Mattered
This wasn’t just a deployment; it was a deep dive into real cloud systems: handling failures, debugging networking, managing IAM restrictions, and redesigning the architecture as needed. It combined:
- VPC networking fundamentals
- Container image lifecycle (Docker → ECR → Fargate)
- Service discovery and load balancing
- Automated, production‑grade rollout processes
Conclusion
Anyone learning DevOps or AWS should attempt a project like this. It forces you to think like an engineer designing real systems, not just someone running commands. It also builds confidence that you can architect and debug production‑style systems from scratch.
Feel free to reach out if you’re working on similar projects or want to discuss cloud architectures.