🚀 Real-Time Data Replication Using MySQL, Debezium, Kafka, and Docker (CDC Guide)

Published: (February 19, 2026 at 10:11 PM EST)
Source: Dev.to


📌 Introduction

Modern applications often need data to move between systems in real time: analytics platforms, microservices, search indexes, or backup databases.

Traditional approaches like batch jobs or cron-based sync introduce delays, inconsistencies, and operational complexity.

This is where Change Data Capture (CDC) becomes powerful.

In this article, we'll build a simple but powerful real-time database replication pipeline using:

  • MySQL
  • Debezium
  • Apache Kafka
  • Kafka Connect (JDBC Sink)
  • Docker Compose

By the end, you'll have a working system that automatically replicates inserts, updates, and deletes from one database to another.


🔥 What is Change Data Capture (CDC)?

Change Data Capture is a technique used to capture database changes (INSERT, UPDATE, DELETE) and stream them to other systems in real time.

Instead of polling the database repeatedly, CDC reads the database transaction log (binlog in MySQL).

This makes CDC:

  • ✅ Real-time
  • ✅ Efficient
  • ✅ Reliable
  • ✅ Scalable
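To make the idea concrete, here is a minimal sketch of what a consumer does with a stream of change events. It assumes events shaped like Debezium's envelope, where `op` is `c` (create), `u` (update), `d` (delete), or `r` (snapshot read) and the row state is carried in `before` and `after`. The in-memory dict standing in for the target table, the `apply_change` helper, and the `id` key column are illustrative assumptions, not part of the pipeline built in this guide.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change(replica: Dict[Any, Row], event: Dict[str, Any], key: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory replica table."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, or update: store the new row state
        after = event["after"]
        replica[after[key]] = after
    elif op == "d":
        # delete: only the old row state is available
        before = event["before"]
        replica.pop(before[key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


# Hypothetical events for a table with columns (id, name)
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Alice"}},
    {"op": "u", "before": {"id": 1, "name": "Alice"}, "after": {"id": 1, "name": "Alicia"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "Bob"}},
    {"op": "d", "before": {"id": 2, "name": "Bob"}, "after": None},
]

replica: Dict[Any, Row] = {}
for event in events:
    apply_change(replica, event)

print(replica)  # {1: {'id': 1, 'name': 'Alicia'}}
```

Because each event carries the full row state, the replica never has to query the source database. It only replays the log in order.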

❓ Why Do We Need CDC?

Common use cases

  • Real-time analytics pipelines
  • Microservices data synchronization
  • Data warehousing
  • Cache invalidation
  • Event-driven architectures
  • Search indexing (Elasticsearch)
  • Zero-downtime migrations

Without CDC

App → DB → Batch Job → Other Systems

With CDC

App → DB → CDC Stream → Multiple Systems

Much faster and cleaner.


🏗 Architecture Overview

We will build the following pipeline:

MySQL Source
     ↓ (binlog)
Debezium Connector
     ↓
Kafka Topic
     ↓
JDBC Sink Connector
     ↓
MySQL Target
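The first hop of this pipeline is configured by registering a Debezium MySQL source connector with Kafka Connect. The sketch below builds such a registration payload. The property names are those of the Debezium 2.x MySQL connector, and the host, credentials, and database come from the Docker Compose file in Step 1. The connector name, the `cdc` topic prefix, the `employees` table, and the `184054` server ID are hypothetical values chosen for illustration, and this may differ from the exact configuration the author used.

```python
import json
from typing import Dict


def debezium_mysql_source(name: str, topic_prefix: str, database: str) -> Dict[str, object]:
    """Build a Kafka Connect registration payload for a Debezium MySQL source."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "tasks.max": "1",
            # Connection details as defined for the mysql-source service
            "database.hostname": "mysql-source",
            "database.port": "3306",
            "database.user": "root",
            "database.password": "root",
            # Must be unique in the replication topology (MySQL itself uses 1)
            "database.server.id": "184054",
            "topic.prefix": topic_prefix,
            "database.include.list": database,
            # Debezium stores DDL history in its own Kafka topic
            "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
            "schema.history.internal.kafka.topic": f"schema-history.{database}",
        },
    }


def change_topic(topic_prefix: str, database: str, table: str) -> str:
    """Debezium names change topics <topic.prefix>.<database>.<table>."""
    return f"{topic_prefix}.{database}.{table}"


payload = debezium_mysql_source("mysql-source-connector", "cdc", "company")
print(json.dumps(payload, indent=2))
print(change_topic("cdc", "company", "employees"))  # cdc.company.employees
```

A payload like this is typically sent as JSON in a `POST` to the Kafka Connect REST API at `http://localhost:8083/connectors`. The sink connector then subscribes to the resulting topic.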

🐳 Step 1: Docker Compose Setup

We'll run everything using Docker so the setup is easy and reproducible.

Create a file called docker-compose.yml:

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
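    ports:
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Two listeners: other containers reach the broker at kafka:9092,
      # clients on the host machine use localhost:29092.
      # Advertising only localhost:9092 would send the connect container
      # back to itself after bootstrapping.
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,PLAINTEXT_HOST://0.0.0.0:29092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1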

  mysql-source:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: company
    command: >
      --server-id=1
      --log-bin=mysql-bin
      --binlog-format=ROW
    ports:
      - "3306:3306"

  mysql-target:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: root
    ports:
      - "3307:3306"

  connect:
    image: debezium/connect:2.5
    depends_on:
      - kafka
      - mysql-source
    ports:
      - "8083:8083"
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: connect_status

Start everything:

docker compose up -d

Give it ~30 seconds for services to fully start.
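A fixed wait can be too short on a slow machine. As an alternative, the sketch below polls until Kafka Connect responds. It assumes the Connect REST API is published on `localhost:8083` as in the Compose file above and uses the standard `GET /connectors` endpoint as a readiness probe. The helper names `connect_is_up` and `wait_until` are made up for this example.

```python
import time
import urllib.request
from typing import Callable


def connect_is_up(url: str = "http://localhost:8083/connectors") -> bool:
    """Return True once the Kafka Connect REST API answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses
        return False


def wait_until(probe: Callable[[], bool], timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Call probe repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_until(connect_is_up)` after `docker compose up -d` returns `True` as soon as the REST API is reachable, or `False` if it has not come up within two minutes.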


🗄 Step 2: Create Database and Table (Source)

Because MySQL is running inside Docker, we execute commands using Docker.

docker compose exec -T mysql-source mysql -uroot -proot ..`).  

---  

### 🚀 Production Best Practices

- Use **Schema Registry** for schema evolution.
- Enable **monitoring & alerting** (Prometheus, Grafana).
- Configure a **Dead-Letter Queue (DLQ)** for problematic records.
- Use migration tools (Flyway, Liquibase) for schema changes.
- Secure credentials (Docker secrets, Vault, or environment variables).
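As one example of the DLQ point above, Kafka Connect sink connectors can route records that fail conversion or transformation to a separate topic instead of stopping the task. The sketch below layers the standard `errors.*` properties onto an existing sink configuration. The `with_dead_letter_queue` helper, the base config, and the topic names are hypothetical, and the replication factor of 1 only suits a single-broker setup like the one in this guide.

```python
from typing import Dict


def with_dead_letter_queue(
    sink_config: Dict[str, str],
    dlq_topic: str,
    replication_factor: int = 1,
) -> Dict[str, str]:
    """Return a copy of a sink connector config with DLQ error handling enabled."""
    merged = dict(sink_config)
    merged.update({
        # Keep the task running when a record cannot be processed
        "errors.tolerance": "all",
        "errors.deadletterqueue.topic.name": dlq_topic,
        "errors.deadletterqueue.topic.replication.factor": str(replication_factor),
        # Attach failure details (exception, original topic) as record headers
        "errors.deadletterqueue.context.headers.enable": "true",
        "errors.log.enable": "true",
    })
    return merged


# Hypothetical base config for a sink connector
base = {"topics": "cdc.company.employees", "tasks.max": "1"}
sink = with_dead_letter_queue(base, "dlq.company.employees")
print(sink["errors.deadletterqueue.topic.name"])  # dlq.company.employees
```

Records landing in the DLQ topic still need to be monitored and replayed or discarded, so this complements alerting rather than replacing it.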

---  

### ✅ Conclusion

Change Data Capture (CDC) enables real-time data movement with minimal overhead.
Debezium + Kafka Connect provides:

- Scalability
- Reliability
- Low latency
- Event-driven architecture

This pattern is widely used in modern distributed systems.

---  

### 🙌 Final Thoughts

If you work in:

- Data Engineering  
- DevOps  
- Backend Systems  
- Platform Engineering  

learning CDC is extremely valuable.

⭐ If you enjoyed this article, feel free to connect and share your feedback!