Overview of Real-Time Data Synchronization from MySQL to VeloDB
Source: Dev.to
Flink can serve as a real‑time data‑synchronization engine when migrating data from MySQL‑compatible databases (e.g., Amazon Aurora) to VeloDB.
Its high‑throughput, low‑latency stream‑processing capabilities enable:
- Initial full load – import existing tables from MySQL/Aurora into VeloDB.
- Incremental change capture – use MySQL binlog (CDC) to capture INSERT/UPDATE/DELETE events and continuously write them to VeloDB.
The overall architecture: Aurora MySQL (binlog) → Flink CDC job → VeloDB.
Example workflow
1. Create an AWS RDS Aurora MySQL instance
2. Create a MySQL database and corresponding tables
CREATE DATABASE test_db;
CREATE TABLE test_db.student (
    id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    age INT,
    email VARCHAR(255),
    phone VARCHAR(20),
    score DECIMAL(5,2),
    created_at TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO test_db.student (id, name, age, email, phone, score, created_at)
VALUES
(1, 'Alice Zhang', 22, 'alice@example.com', '13800138000', 89.50, NOW()),
(2, 'Bob Li', 21, 'bob@example.com', '13900139000', 76.80, NOW()),
(3, 'Charlie Wang', 23, 'charlie@example.com', '13600136000', 92.00, NOW()),
(4, 'David Chen', 20, 'david@example.com', '13500135000', 85.60, NOW()),
(5, 'Emma Liu', 22, 'emma@example.com', '13700137000', 78.90, NOW());
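Before moving on, it is worth confirming the seed rows are in place. A quick check with the mysql client might look like this (YOUR_MYSQL_ENDPOINT is a placeholder for the Aurora writer endpoint):

```shell
# Sanity check on the source side; YOUR_MYSQL_ENDPOINT is a placeholder.
mysql -h YOUR_MYSQL_ENDPOINT -u admin -p \
  -e "SELECT COUNT(*) AS rows_loaded FROM test_db.student;"
# rows_loaded should be 5 (the five rows inserted above).
```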
3. Create a VeloDB warehouse
4. Modify MySQL (Aurora) configuration
- Create a custom DB cluster parameter group and enable binlog.
- Set binlog_format to ROW.
- Attach the parameter group to the DB cluster and restart the instance.
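Assuming the AWS CLI is configured, these steps can be sketched as follows. The group and cluster names are illustrative, and the parameter-group family must match your Aurora MySQL version:

```shell
# Create a cluster parameter group (names below are placeholders).
aws rds create-db-cluster-parameter-group \
  --db-cluster-parameter-group-name aurora-binlog-row \
  --db-parameter-group-family aurora-mysql8.0 \
  --description "Aurora MySQL with ROW binlog for CDC"

# Set binlog_format to ROW (a static parameter, applied on reboot).
aws rds modify-db-cluster-parameter-group \
  --db-cluster-parameter-group-name aurora-binlog-row \
  --parameters "ParameterName=binlog_format,ParameterValue=ROW,ApplyMethod=pending-reboot"

# Attach the parameter group to the cluster, then reboot the instance.
aws rds modify-db-cluster \
  --db-cluster-identifier database-test \
  --db-cluster-parameter-group-name aurora-binlog-row

aws rds reboot-db-instance --db-instance-identifier database-test-instance-1
```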
5. Install Flink with the Doris connector
5.1 Download the pre‑built package
A ready‑to‑use Flink 1.17 distribution (including the Doris connector) can be downloaded and extracted.
5.2 Manual installation (if you already have a Flink environment)
Download the following artifacts:
- Flink 1.17 (binary distribution)
- Flink MySQL CDC connector
- Flink Doris connector
- MySQL JDBC driver
Extract the Flink distribution:
tar -zxvf flink-1.17.2-bin-scala_2.12.tgz
Copy the connector JARs and the MySQL driver into flink-1.17.2-bin/lib:
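For example (the Doris connector JAR name matches the job command used below; the MySQL CDC connector and JDBC driver file names are assumptions — substitute the versions you actually downloaded):

```shell
# Stage the connector JARs and JDBC driver into Flink's classpath.
# Only the Doris connector name is taken from this walkthrough's job
# command; the other two file names are illustrative.
cp flink-sql-connector-mysql-cdc-*.jar  flink-1.17.2-bin/lib/
cp flink-doris-connector-1.17-25.1.0.jar flink-1.17.2-bin/lib/
cp mysql-connector-j-*.jar              flink-1.17.2-bin/lib/
```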
6. Submit the Flink synchronization job
When the job runs, the Doris connector automatically creates the target tables in VeloDB based on the source MySQL schema. Flink can be deployed in Local, Standalone, YARN, or Kubernetes modes.
6.1 Local environment example
cd flink-1.17.2-bin
bin/flink run -t local \
-Dexecution.checkpointing.interval=10s \
-Dparallelism.default=1 \
-c org.apache.doris.flink.tools.cdc.CdcTools \
lib/flink-doris-connector-1.17-25.1.0.jar \
mysql-sync-database \
--database test_db \
--mysql-conf hostname=database-test.cluster-ro-ckbuyoqerz2c.us-east-1.rds.amazonaws.com \
--mysql-conf port=3306 \
--mysql-conf username=admin \
--mysql-conf password=YOUR_PASSWORD \
--mysql-conf server-id=5400 \
--mysql-conf server-time-zone=UTC \
--doris-conf fenodes=YOUR_VELODB_ENDPOINT \
--doris-conf username=YOUR_VELODB_USER \
--doris-conf password=YOUR_VELODB_PASSWORD \
--doris-conf table.identifier=test_db.student \
--doris-conf sink.enable-delete=true
Replace the placeholder values (YOUR_PASSWORD, YOUR_VELODB_ENDPOINT, etc.) with your actual credentials and endpoint information.
The job will:
- Perform an initial snapshot of test_db.student and load it into VeloDB.
- Continuously read binlog events from Aurora MySQL and apply the corresponding inserts, updates, and deletes to the VeloDB table in real time.
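Since VeloDB is Doris-compatible and speaks the MySQL protocol, the pipeline can be verified end to end with the mysql client. The host variables and the 9030 query port below are assumptions for a typical Doris-style deployment — adjust to your warehouse's connection details:

```shell
# Sketch: end-to-end verification (hosts, port, and credentials are placeholders).

# 1. Row count in VeloDB should match the 5 rows seeded into Aurora.
mysql -h YOUR_VELODB_ENDPOINT -P 9030 -u YOUR_VELODB_USER -p \
  -e "SELECT COUNT(*) FROM test_db.student;"

# 2. Apply a change on the MySQL side.
mysql -h YOUR_MYSQL_ENDPOINT -u admin -p \
  -e "UPDATE test_db.student SET score = 95.00 WHERE id = 1;"

# 3. After the next checkpoint (~10 s, per execution.checkpointing.interval),
#    the update should be visible in VeloDB.
mysql -h YOUR_VELODB_ENDPOINT -P 9030 -u YOUR_VELODB_USER -p \
  -e "SELECT id, name, score FROM test_db.student WHERE id = 1;"
```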