# Real-Time Data Replication Using MySQL, Debezium, Kafka, and Docker (CDC Guide)
Source: Dev.to
### Introduction

Modern applications often need data to move between systems in real time: analytics platforms, microservices, search indexes, or backup databases.
Traditional approaches like batch jobs or cron-based sync introduce delays, inconsistencies, and operational complexity.
This is where Change Data Capture (CDC) becomes powerful.
In this article, we'll build a simple but capable real-time database replication pipeline using:
- MySQL
- Debezium
- Apache Kafka
- Kafka Connect (JDBC Sink)
- Docker Compose
By the end, you'll have a working system that automatically replicates inserts, updates, and deletes from one database to another.

### What Is Change Data Capture (CDC)?
Change Data Capture is a technique used to capture database changes (INSERT, UPDATE, DELETE) and stream them to other systems in real time.
Instead of polling the database repeatedly, CDC reads the database transaction log (binlog in MySQL).
This makes CDC:
- Real-time
- Efficient
- Reliable
- Scalable
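To make this concrete, here is roughly what a Debezium change event looks like for an UPDATE (the field names follow Debezium's event envelope; the table and values are illustrative):

```json
{
  "before": { "id": 1, "name": "Alice", "salary": 50000 },
  "after":  { "id": 1, "name": "Alice", "salary": 55000 },
  "source": { "connector": "mysql", "db": "company", "table": "employees" },
  "op": "u",
  "ts_ms": 1700000000000
}
```

The `op` field distinguishes creates (`c`), updates (`u`), and deletes (`d`), so downstream consumers can replay every change faithfully.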
### Why Do We Need CDC?

Common use cases:
- Real-time analytics pipelines
- Microservices data synchronization
- Data warehousing
- Cache invalidation
- Event-driven architectures
- Search indexing (Elasticsearch)
- Zero-downtime migrations
**Without CDC:**
App → DB → Batch Job → Other Systems

**With CDC:**
App → DB → CDC Stream → Multiple Systems
Much faster and cleaner.
### Architecture Overview
We will build the following pipeline:
```
MySQL Source
  ↓ (binlog)
Debezium Connector
  ↓
Kafka Topic
  ↓
JDBC Sink Connector
  ↓
MySQL Target
```
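Once the stack is up, the Debezium half of this pipeline is configured by POSTing JSON to Kafka Connect's REST API (port 8083). As a sketch, a minimal MySQL source connector payload could look like this — the connector name, credentials, and topic prefix are illustrative, and exact property names vary across Debezium versions (these match Debezium 2.x):

```json
{
  "name": "mysql-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql-source",
    "database.port": "3306",
    "database.user": "root",
    "database.password": "root",
    "database.server.id": "184054",
    "topic.prefix": "company-server",
    "database.include.list": "company",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-changes.company"
  }
}
```

Debezium then writes one Kafka topic per captured table, which the JDBC sink connector reads on the other side.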
### Step 1: Docker Compose Setup

We'll run everything with Docker so the setup is easy and reproducible.
Create a file called `docker-compose.yml`:
```yaml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      # Advertise the in-network hostname so other containers (like Connect) can reach the broker
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  mysql-source:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: company
    command: >
      --server-id=1
      --log-bin=mysql-bin
      --binlog-format=ROW
    ports:
      - "3306:3306"

  mysql-target:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: root
    ports:
      - "3307:3306"

  connect:
    image: debezium/connect:2.5
    depends_on:
      - kafka
      - mysql-source
    ports:
      - "8083:8083"
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: connect_status
```
Start everything:

```sh
docker compose up -d
```

Give the services ~30 seconds to fully start.
### Step 2: Create the Database and Table (Source)

Because MySQL is running inside Docker, we execute commands through the container's MySQL client:

```sh
docker compose exec -T mysql-source mysql -uroot -proot
```
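As a sketch, the source schema for this walkthrough could be a single table (the table name, columns, and sample row are illustrative):

```sql
-- Run inside the mysql-source MySQL shell
USE company;

CREATE TABLE employees (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  salary DECIMAL(10, 2)
);

INSERT INTO employees (name, salary) VALUES ('Alice', 50000.00);
```

Any subsequent INSERT, UPDATE, or DELETE on this table is written to the binlog (we enabled `--log-bin` with `ROW` format above), which is exactly what Debezium reads.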
---
### Production Best Practices
- Use **Schema Registry** for schema evolution.
- Enable **monitoring & alerting** (Prometheus, Grafana).
- Configure a **Dead-Letter Queue (DLQ)** for problematic records.
- Use migration tools (Flyway, Liquibase) for schema changes.
- Secure credentials (Docker secrets, Vault, or environment variables).
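For example, a dead-letter queue can be enabled on a sink connector through Kafka Connect's standard error-handling properties (the topic name is illustrative):

```json
{
  "errors.tolerance": "all",
  "errors.deadletterqueue.topic.name": "dlq-mysql-sink",
  "errors.deadletterqueue.context.headers.enable": "true",
  "errors.log.enable": "true"
}
```

With the single-broker setup above, you would also set `errors.deadletterqueue.topic.replication.factor` to `1`, since the default is higher than one broker can satisfy.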
---
### Conclusion

Change Data Capture (CDC) enables real-time data movement with minimal overhead.
Debezium + Kafka Connect provides:
- Scalability
- Reliability
- Low latency
- Event-driven architecture
This pattern is widely used in modern distributed systems.
---
### Final Thoughts
If you work in:
- Data Engineering
- DevOps
- Backend Systems
- Platform Engineering
learning CDC is extremely valuable.
If you enjoyed this article, feel free to connect and share your feedback!