Overview of Real-Time Data Synchronization from MySQL to VeloDB

Published: December 2, 2025 at 03:40 PM EST
2 min read
Source: Dev.to

Flink can serve as a real‑time data‑synchronization engine when migrating data from MySQL‑compatible databases (e.g., Amazon Aurora) to VeloDB.
Its high‑throughput, low‑latency stream‑processing capabilities enable:

  • Initial full load – import existing tables from MySQL/Aurora into VeloDB.
  • Incremental change capture – use MySQL binlog (CDC) to capture INSERT/UPDATE/DELETE events and continuously write them to VeloDB.
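For a single table, this snapshot-then-binlog pipeline can also be sketched in Flink SQL. This is a conceptual sketch, not the command used later in this post: the endpoints and credentials are placeholders, and the option names (taken from the flink-cdc and flink-doris-connector documentation) may vary by connector version.

```sql
-- Conceptual Flink SQL version of the pipeline for one table (a sketch)
CREATE TABLE student_src (
    id         INT,
    name       STRING,
    age        INT,
    email      STRING,
    phone      STRING,
    score      DECIMAL(5,2),
    created_at TIMESTAMP(0),
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    'connector'         = 'mysql-cdc',
    'hostname'          = '<aurora-endpoint>',
    'port'              = '3306',
    'username'          = 'admin',
    'password'          = '<password>',
    'database-name'     = 'test_db',
    'table-name'        = 'student',
    'scan.startup.mode' = 'initial'   -- full snapshot first, then binlog
);

CREATE TABLE student_sink (
    id         INT,
    name       STRING,
    age        INT,
    email      STRING,
    phone      STRING,
    score      DECIMAL(5,2),
    created_at TIMESTAMP(0),
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    'connector'          = 'doris',
    'fenodes'            = '<velodb-endpoint>',
    'table.identifier'   = 'test_db.student',
    'username'           = '<velodb-user>',
    'password'           = '<velodb-password>',
    'sink.enable-delete' = 'true'     -- propagate DELETE events
);

INSERT INTO student_sink SELECT * FROM student_src;
```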

The overall architecture is illustrated below:

Architecture diagram

Example workflow

1. Create an AWS RDS Aurora MySQL instance

Create Aurora instance

2. Create a MySQL database and corresponding tables

CREATE DATABASE test_db;

CREATE TABLE test_db.student (
    id          INT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL,
    age         INT,
    email       VARCHAR(255),
    phone       VARCHAR(20),
    score       DECIMAL(5,2),
    created_at  TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

INSERT INTO test_db.student (id, name, age, email, phone, score, created_at)
VALUES
    (1, 'Alice Zhang', 22, 'alice@example.com', '13800138000', 89.50, NOW()),
    (2, 'Bob Li',      21, 'bob@example.com',   '13900139000', 76.80, NOW()),
    (3, 'Charlie Wang', 23, 'charlie@example.com', '13600136000', 92.00, NOW()),
    (4, 'David Chen',  20, 'david@example.com', '13500135000', 85.60, NOW()),
    (5, 'Emma Liu',    22, 'emma@example.com',  '13700137000', 78.90, NOW());

3. Create a VeloDB warehouse

Create VeloDB warehouse

4. Modify MySQL (Aurora) configuration

  1. Create a parameter group and enable binlog.

    Parameter group

  2. Set binlog_format to ROW.

    binlog_format=ROW

  3. Attach the parameter group to the DB cluster and restart the instance.

    Apply parameter group
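After the restart, the binlog settings can be checked from any MySQL client connected to the cluster (the expected values are shown as comments; `log_bin` is managed by Aurora once the parameter group is attached):

```sql
-- Verify binlog configuration after attaching the parameter group and restarting
SHOW VARIABLES LIKE 'log_bin';        -- expected: ON
SHOW VARIABLES LIKE 'binlog_format';  -- expected: ROW
```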

5. Download the pre‑built package

A ready‑to‑use Flink 1.17 distribution (including the Doris connector) can be downloaded and extracted.

Download the following artifacts:

  • Flink 1.17 (binary distribution)
  • Flink MySQL CDC connector
  • Flink Doris connector
  • MySQL JDBC driver

Extract the Flink distribution:

tar -zxvf flink-1.17.2-bin-scala_2.12.tgz

Copy the connector JARs and the MySQL driver into flink-1.17.2-bin/lib:

Copy JARs to lib directory
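The copy step might look like the following (a sketch; the exact JAR file names depend on the connector and driver versions you downloaded):

```shell
# Copy the CDC connector, Doris connector, and MySQL driver into Flink's classpath
cp flink-sql-connector-mysql-cdc-*.jar flink-1.17.2-bin/lib/
cp flink-doris-connector-1.17-*.jar    flink-1.17.2-bin/lib/
cp mysql-connector-j-*.jar             flink-1.17.2-bin/lib/
```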

When the job runs, the Doris connector automatically creates the target tables in VeloDB based on the source MySQL schema. Flink can be deployed in Local, Standalone, YARN, or Kubernetes modes.
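As a rough illustration, the auto-created table in VeloDB could resemble the following Doris-style DDL. This is hypothetical: the connector derives the actual schema, type widths, and table properties from the MySQL source, so treat the details below as assumptions.

```sql
-- Hypothetical shape of the auto-created table. A Unique Key model table
-- lets binlog UPDATE/DELETE events be applied by primary key.
CREATE TABLE test_db.student (
    id         INT,
    name       VARCHAR(300),
    age        INT,
    email      VARCHAR(765),
    phone      VARCHAR(60),
    score      DECIMAL(5,2),
    created_at DATETIME
)
UNIQUE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS AUTO
PROPERTIES ("replication_num" = "3");
```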

6. Local environment example

cd flink-1.17.2-bin
bin/flink run -t local \
    -Dexecution.checkpointing.interval=10s \
    -Dparallelism.default=1 \
    -c org.apache.doris.flink.tools.cdc.CdcTools \
    lib/flink-doris-connector-1.17-25.1.0.jar \
    mysql-sync-database \
    --database test_db \
    --mysql-conf hostname=database-test.cluster-ro-ckbuyoqerz2c.us-east-1.rds.amazonaws.com \
    --mysql-conf port=3306 \
    --mysql-conf username=admin \
    --mysql-conf password=YOUR_PASSWORD \
    --mysql-conf server-id=5400 \
    --mysql-conf server-time-zone=UTC \
    --doris-conf fenodes=YOUR_VELODB_ENDPOINT \
    --doris-conf username=YOUR_VELODB_USER \
    --doris-conf password=YOUR_VELODB_PASSWORD \
    --doris-conf table.identifier=test_db.student \
    --doris-conf sink.enable-delete=true

Replace the placeholder values (YOUR_PASSWORD, YOUR_VELODB_ENDPOINT, etc.) with your actual credentials and endpoint information.

The job will:

  1. Perform an initial snapshot of test_db.student and load it into VeloDB.
  2. Continuously read binlog events from Aurora MySQL and apply the corresponding inserts, updates, and deletes to the VeloDB table in real time.
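Once the job is running, the incremental path can be exercised by issuing DML against the Aurora source and watching the same rows change in VeloDB, for example (the row with id 6 is illustrative test data):

```sql
-- Run against Aurora MySQL; each change should be mirrored into VeloDB
INSERT INTO test_db.student
VALUES (6, 'Frank Zhao', 24, 'frank@example.com', '13100131000', 88.00, NOW());

UPDATE test_db.student SET score = 95.00 WHERE id = 3;

DELETE FROM test_db.student WHERE id = 5;
```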