A Complete Beginner's Guide to Blue-Green Deployment with Nginx and Real-Time Alerts

Published: December 9, 2025, 11:49 PM GMT+9

8 min read

Source: Dev.to

Introduction

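Welcome to this comprehensive guide to Blue-Green Deployment, a deployment strategy that companies such as Netflix, Amazon, and Facebook use to release without downtime. This project shows how to implement a production-grade blue-green deployment system with automatic failover and real-time Slack alerts.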

Repository: HNG DevOps on GitHub

What You’ll Learn

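What blue-green deployment is and why it matters
How to implement automatic failover with Nginx
How to build a real-time monitoring and alerting system
How to achieve zero-downtime deployments
How to integrate Slack for DevOps alerts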

Prerequisites

Basic understanding of Docker
Experience with the command line
Basic knowledge of web servers (helpful but not required)

What is Blue‑Green Deployment?

The Problem: Traditional Deployments

Deploying a new version of a website typically goes like this:

Stop the old version
Deploy the new version
Start the new version

During this process the site is DOWN: users see errors, and revenue can be lost.

The Solution: Blue‑Green Deployment

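Maintain two identical environments:

BLUE (Production) – the environment currently serving users
GREEN (Staging) – the environment where the new version waits to go live

Deployment workflow:

Deploy the new version to GREEN
Test GREEN thoroughly
Switch traffic from BLUE to GREEN in one step
If something goes wrong, switch back to BLUE immediately

Result: ZERO DOWNTIME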

Real‑World Analogy

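Picture a concert with two stages:

Stage 1 (Blue): a band is performing and the audience is watching
Stage 2 (Green): the next band is setting up and doing its sound check

At changeover, the stage rotates 180° and the audience now faces Stage 2 (Green). If the new band runs into trouble, the stage rotates straight back to the original.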

Project Overview

This project implements a production-ready blue-green deployment system with the following core features.

Core Features

Automatic Failover

Nginx detects a failure of the Blue instance
All traffic is automatically routed to Green
The failed request is retried on Green, so users should not see the error

Real‑Time Alerting

A Python watcher monitors the Nginx logs
Detects failover events as they appear in the log
Sends alerts to Slack
Monitors the error rate
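
As a rough illustration of the error-rate monitoring mentioned above, the sketch below keeps a sliding window of recent HTTP status codes and reports when the share of 5xx responses crosses a threshold. The class name, window size, and threshold are assumptions made for this example and are not taken from the project's watcher.py.

```python
from collections import deque


class ErrorRateWindow:
    """Track the share of 5xx responses over the most recent requests."""

    def __init__(self, size=200, threshold=0.02):
        # size and threshold are illustrative defaults, not project values
        self.statuses = deque(maxlen=size)
        self.threshold = threshold

    def record(self, status):
        """Add one HTTP status code; the oldest entry drops out when full."""
        self.statuses.append(status)

    def error_rate(self):
        """Return the fraction of recorded statuses that are 5xx."""
        if not self.statuses:
            return 0.0
        errors = sum(1 for status in self.statuses if status >= 500)
        return errors / len(self.statuses)

    def breached(self):
        """Judge only a full window, to avoid alerting on a tiny sample."""
        full = len(self.statuses) == self.statuses.maxlen
        return full and self.error_rate() > self.threshold


window = ErrorRateWindow(size=10, threshold=0.2)
for status in [200] * 7 + [502] * 3:
    window.record(status)
print(window.error_rate(), window.breached())
```

Waiting for a full window before alerting is one possible design choice; a real watcher might also add a cooldown so that a sustained error burst does not produce repeated alerts.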

Zero‑Downtime Deployment

Deploy the new version to the inactive instance
Switch traffic in one step
Roll back within seconds if needed

Structured Logging

Every request is logged with metadata (pool, release version, response time)
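
To make the idea of structured logging concrete, here is a minimal parsing sketch. The article does not show the project's actual log format, so the key=value layout and the field names (pool, release, status, request_time, request) are assumptions chosen only for illustration.

```python
import re

# Matches key=value pairs; values may be bare tokens or double-quoted strings.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_log_line(line):
    """Turn one key=value log line (hypothetical format) into a dict."""
    record = {key: value.strip('"') for key, value in FIELD.findall(line)}
    if "status" in record:
        record["status"] = int(record["status"])
    if "request_time" in record:
        record["request_time"] = float(record["request_time"])
    return record


sample = (
    'pool=green release=green-release-1.0.0 status=200 '
    'request_time=0.012 request="GET / HTTP/1.1"'
)
record = parse_log_line(sample)
print(record["pool"], record["status"], record["request_time"])
```

Keeping the pool and release in every log entry is what would let a watcher tell which instance served each request.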

Understanding the Core Concepts

1. Blue‑Green vs. Load Balancing

Load balancing distributes traffic across multiple instances:

Request 1 → Blue
Request 2 → Green
Request 3 → Blue
Request 4 → Green

Blue-green (this project) routes all traffic to a single primary instance, while the other instance acts as a hot standby:

All Requests → Blue (Primary)
               Green (Backup, standby)

If Blue fails:
All Requests → Green (Backup becomes active)
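
The difference between the two routing styles can be sketched in a few lines of Python. This is a toy model for comparison only, not how Nginx is implemented; the function names are made up for the example.

```python
from itertools import cycle


def round_robin(servers):
    """Load balancing: hand out servers in rotation."""
    rotation = cycle(servers)
    return lambda: next(rotation)


def primary_backup(primary, backup, healthy):
    """Blue-green as described here: use the primary unless it is unhealthy."""
    return lambda: primary if healthy[primary] else backup


pick = round_robin(["blue", "green"])
rr = [pick() for _ in range(4)]

health = {"blue": True, "green": True}
pick = primary_backup("blue", "green", health)
before = [pick() for _ in range(3)]
health["blue"] = False  # simulate Blue going down
after = [pick() for _ in range(3)]

print(rr)      # alternates between blue and green
print(before)  # everything goes to blue
print(after)   # everything goes to green
```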

2. Nginx Upstream Configuration

Nginx routes traffic to a group of backend servers (an upstream). This project uses the following configuration:

upstream app {
    server app_blue:3000 max_fails=1 fail_timeout=5s;
    server app_green:3000 backup max_fails=1 fail_timeout=5s;
}

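Key directives

server app_blue:3000 – the primary server
backup – receives requests only when the primary server is unavailable
max_fails=1 – mark the server as unavailable after a single failed attempt within fail_timeout
fail_timeout=5s – keep the server marked unavailable for 5 seconds, then try it again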

3. Failover Mechanism

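When a request fails, the flow is as follows:

Nginx tries the Blue instance
If Blue times out, refuses the connection, or returns a 5xx error, the attempt counts as failed (5xx responses count only if proxy_next_upstream lists them, for example http_502; the Nginx default covers only error and timeout)
Nginx retries the request on Green, the backup
The user receives a normal response from Green
Subsequent requests go to Green until fail_timeout expires and Nginx tries Blue again

Result: the user does not see the error, as long as the request can be retried on Green.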
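
The failover flow can be modeled with a small simulation. This is a simplified sketch of passive health checking under stated assumptions: it treats any 5xx status as a failed attempt (which in Nginx depends on proxy_next_upstream), and it ignores the detail that Nginx counts failures within the fail_timeout window. The Upstream class and its method names are invented for the example.

```python
class Upstream:
    """Toy model of a primary/backup upstream with max_fails and fail_timeout."""

    def __init__(self, primary, backup, max_fails=1, fail_timeout=5.0):
        self.primary, self.backup = primary, backup
        self.max_fails, self.fail_timeout = max_fails, fail_timeout
        self.fails = 0
        self.down_until = 0.0  # time before which the primary is skipped

    def handle(self, call, now):
        """call(name) returns an HTTP status; returns (server, status)."""
        if now >= self.down_until:
            status = call(self.primary)
            if status < 500:
                self.fails = 0
                return self.primary, status
            self.fails += 1
            if self.fails >= self.max_fails:
                # Mark the primary unavailable for fail_timeout seconds
                self.down_until = now + self.fail_timeout
                self.fails = 0
        # Primary failed or is marked down: retry the same request on the backup
        return self.backup, call(self.backup)


statuses = {"app_blue": 500, "app_green": 200}
upstream = Upstream("app_blue", "app_green", max_fails=1, fail_timeout=5.0)

first = upstream.handle(statuses.get, now=0.0)   # Blue fails, Green answers
second = upstream.handle(statuses.get, now=1.0)  # Blue skipped while marked down
statuses["app_blue"] = 200                       # Blue recovers
third = upstream.handle(statuses.get, now=6.0)   # fail_timeout elapsed, Blue again
print(first, second, third)
```

Passing the clock in as the now argument keeps the model deterministic, which makes the timing behavior easy to check without sleeping.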

Architecture Overview

System Architecture Diagram

                Internet
                   |
                   v
            +--------------+
            |    Nginx     |
            |  (Port 8080) |
            +--------------+
                   |
    +--------------+--------------+
    |                             |
    v                             v
+----------+                +----------+
| App Blue | (Primary)      |App Green | (Backup)
|Port 3000 |                |Port 3000 |
+----------+                +----------+
    |                             |
    +-------------+---------------+
                  |
                  v
          +--------------+
          | Nginx Logs   |
          +--------------+
                  |
                  v
        +------------------+
        | Alert Watcher    |
        | (Python)         |
        +------------------+
                  |
                  v
          +--------------+
          |    Slack     |
          +--------------+

Component Breakdown

1. Nginx Proxy

Role: traffic router and load balancer
Responsibilities: routes all incoming requests, detects backend failures, performs automatic failover, logs every request with metadata

2. App Blue (Primary Instance)

Environment Variables

APP_POOL=blue
RELEASE_ID=blue-release-1.0.0
PORT=3000

3. App Green (Backup Instance)

Environment Variables

APP_POOL=green
RELEASE_ID=green-release-1.0.0
PORT=3000
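
One way an application instance could surface these variables is as response headers, so that the proxy can record which pool and release served each request. The sketch below is an assumption about how that might look; the header names X-App-Pool and X-Release-Id and the helper function are hypothetical and do not come from the article.

```python
import os


def identity_headers(env=None):
    """Build identifying headers from APP_POOL and RELEASE_ID.

    Header names are hypothetical; adjust them to whatever the app really emits.
    """
    env = os.environ if env is None else env
    return {
        "X-App-Pool": env.get("APP_POOL", "unknown"),
        "X-Release-Id": env.get("RELEASE_ID", "unknown"),
    }


blue_headers = identity_headers(
    {"APP_POOL": "blue", "RELEASE_ID": "blue-release-1.0.0", "PORT": "3000"}
)
print(blue_headers)
```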

4. Alert Watcher (Python)

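Role: real-time log monitoring and alerting
Responsibilities: tails the Nginx access log, parses the structured log entries, detects failover events, monitors the error rate, sends Slack alerts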
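
Failover detection can be reduced to noticing when the pool that serves requests changes between consecutive log entries. The following is a minimal sketch under the assumption that each parsed log record is a dict with a pool key; the function name and record shape are illustrative, not the actual watcher.py logic.

```python
def detect_failovers(records, initial_pool=None):
    """Return (previous_pool, new_pool) pairs for every change of serving pool."""
    last = initial_pool
    events = []
    for record in records:
        pool = record.get("pool")
        if not pool:
            continue  # skip entries without pool metadata
        if last is not None and pool != last:
            events.append((last, pool))
        last = pool
    return events


records = [
    {"pool": "blue", "status": 200},
    {"pool": "blue", "status": 200},
    {"pool": "green", "status": 200},  # failover: blue -> green
    {"status": 200},                   # no pool field, ignored
    {"pool": "green", "status": 200},
    {"pool": "blue", "status": 200},   # recovery: green -> blue
]
events = detect_failovers(records)
print(events)
```

Each detected pair is a natural trigger for a Slack alert, and the same loop could feed status codes into an error-rate check.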

Technology Stack

Component	Technology	Purpose
Reverse Proxy	Nginx (Alpine)	Traffic routing & failover
Application	Python/Flask	Demo web application
Monitoring	Python 3.11	Log watcher & alerting
Alerting	Slack Webhooks	Real‑time notifications
Containerization	Docker	Package all services
Orchestration	Docker Compose	Manage multi‑container setup

Setting Up the Project

Step 1: Clone the Repository

git clone https://github.com/cypher682/hng13-stage-3-devops.git
cd hng13-stage-3-devops

Step 2: Understand the Project Structure

hng13-stage-3-devops/
├── docker-compose.yml       # Container orchestration
├── nginx.conf.template      # Nginx configuration template
├── entrypoint.sh            # Nginx startup script
├── watcher.py               # Python log monitoring script
├── requirements.txt         # Python dependencies
├── .env.example             # Environment variables template
├── test-failover.sh         # Failover testing script
└── public/                  # Static HTML files

Step 3: Configure Environment Variables

Copy the example environment file and edit it:

cp .env.example .env

# Application Configuration
PORT=3000
ACTIVE_POOL=blue

# Docker Images
BLUE_IMAGE=yimikaade/wonderful:latest
GREEN_IMAGE=yimikaade/wonderful:latest

# Release Identifiers
RELEASE_ID_BLUE=blue-release-1.0.0
RELEASE_ID_GREEN=green-release-1.0.0

# Slack Integration
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/...
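
For context on how SLACK_WEBHOOK_URL is typically used: Slack incoming webhooks accept an HTTP POST with a JSON body containing a text field. The sketch below shows one way to build and send such a message with the standard library only. The function names and message wording are assumptions for illustration and may differ from what the project's watcher sends.

```python
import json
import os
import urllib.request


def build_payload(previous_pool, new_pool, release_id):
    """Build a Slack message body; the wording is illustrative."""
    text = f"Failover detected: {previous_pool} -> {new_pool} (release {release_id})"
    return {"text": text}


def send_alert(payload, webhook_url=None):
    """POST the payload to the webhook; returns False when no URL is configured."""
    url = webhook_url or os.environ.get("SLACK_WEBHOOK_URL")
    if not url:
        return False
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status == 200


payload = build_payload("blue", "green", "green-release-1.0.0")
print(json.dumps(payload))
# send_alert(payload)  # uncomment once SLACK_WEBHOOK_URL is set to a real webhook
```

Treat the webhook URL as a secret: keep it in .env and out of version control.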

Step 4: Start the Services

docker compose up -d

Step 5: Test Failover

Run the provided script to simulate a failure and observe automatic failover and Slack alerts:

./test-failover.sh
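
If you want to check the result of a failover test yourself, a small probe can tally which pool answered each request. This sketch is not the contents of test-failover.sh. It assumes Nginx listens on port 8080 (as in the architecture diagram) and that responses carry a hypothetical X-App-Pool header; both are assumptions to adjust for your setup.

```python
import urllib.error
import urllib.request
from collections import Counter


def probe(url="http://localhost:8080/"):
    """Return (status, pool) for one request; status 0 means no response."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status, response.headers.get("X-App-Pool")
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses still carry a status code and headers
        return error.code, error.headers.get("X-App-Pool")
    except OSError:
        # Connection refused, DNS failure, or timeout
        return 0, None


def summarize(responses):
    """Count successes, failures, and responses per pool."""
    responses = list(responses)
    ok = sum(1 for status, _ in responses if status == 200)
    by_pool = dict(Counter(pool for _, pool in responses))
    return {"total": len(responses), "ok": ok,
            "failed": len(responses) - ok, "by_pool": by_pool}


# Canned data standing in for real probes, e.g. [probe() for _ in range(20)]
sample = [(200, "blue")] * 3 + [(200, "green")] * 2
summary = summarize(sample)
print(summary)
```

A run with zero failed requests and a shift from blue to green in the per-pool counts would be consistent with the failover behavior described in this guide.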

You now have a complete blue-green deployment environment with automatic failover, structured logging, and real-time Slack alerts. Happy deploying!