Overview of real-time data synchronization from MySQL to VeloDB
Source: Dev.to
Flink can serve as a real-time data synchronization engine when migrating data from MySQL-compatible databases (e.g., Amazon Aurora) to VeloDB.
With high-throughput, low-latency stream processing, it supports:
- Initial full load – imports the existing tables in MySQL/Aurora into VeloDB.
- Incremental change capture – uses the MySQL binlog (CDC) to capture INSERT/UPDATE/DELETE events and continuously writes them to VeloDB.
The overall pipeline is: MySQL/Aurora → Flink (CDC) → VeloDB.
Example workflow
1. Create an AWS RDS Aurora MySQL instance
2. Create a MySQL database and corresponding tables
CREATE DATABASE test_db;
CREATE TABLE test_db.student (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
age INT,
email VARCHAR(255),
phone VARCHAR(20),
score DECIMAL(5,2),
created_at TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO test_db.student (id, name, age, email, phone, score, created_at)
VALUES
(1, 'Alice Zhang', 22, 'alice@example.com', '13800138000', 89.50, NOW()),
(2, 'Bob Li', 21, 'bob@example.com', '13900139000', 76.80, NOW()),
(3, 'Charlie Wang', 23, 'charlie@example.com', '13600136000', 92.00, NOW()),
(4, 'David Chen', 20, 'david@example.com', '13500135000', 85.60, NOW()),
(5, 'Emma Liu', 22, 'emma@example.com', '13700137000', 78.90, NOW());
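A quick sanity check confirms the seed data landed; the count follows directly from the five rows inserted above:

```sql
-- Verify the seed data in the source table
SELECT COUNT(*) FROM test_db.student;  -- returns 5
```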
3. Create a VeloDB warehouse
4. Modify MySQL (Aurora) configuration
- Create a parameter group and enable binlog.
- Set binlog_format to ROW.
- Attach the parameter group to the DB cluster and restart the instance.
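After the restart, it is worth confirming that binary logging actually took effect before starting the Flink job. These are standard MySQL statements, run against the Aurora endpoint with any MySQL client; the expected values assume the parameter group above was applied:

```sql
-- Verify the binlog configuration on the Aurora MySQL instance
SHOW VARIABLES LIKE 'log_bin';        -- expected: ON
SHOW VARIABLES LIKE 'binlog_format';  -- expected: ROW
```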
5. Install Flink with the Doris connector
5.1 Download the pre‑built package
A ready‑to‑use Flink 1.17 distribution (including the Doris connector) can be downloaded and extracted.
5.2 Manual installation (if you already have a Flink environment)
Download the following artifacts:
- Flink 1.17 (binary distribution)
- Flink MySQL CDC connector
- Flink Doris connector
- MySQL JDBC driver
Extract the Flink distribution:
tar -zxvf flink-1.17.2-bin-scala_2.12.tgz
Copy the connector JARs and the MySQL driver into flink-1.17.2-bin/lib (the JAR file names below are illustrative; match the versions you actually downloaded):
cp flink-sql-connector-mysql-cdc-*.jar flink-1.17.2-bin/lib/
cp flink-doris-connector-1.17-*.jar flink-1.17.2-bin/lib/
cp mysql-connector-j-*.jar flink-1.17.2-bin/lib/
6. Submit the Flink synchronization job
When the job runs, the Doris connector automatically creates the target tables in VeloDB based on the source MySQL schema. Flink can be deployed in Local, Standalone, YARN, or Kubernetes modes.
6.1 Local environment example
cd flink-1.17.2-bin
bin/flink run -t local \
-Dexecution.checkpointing.interval=10s \
-Dparallelism.default=1 \
-c org.apache.doris.flink.tools.cdc.CdcTools \
lib/flink-doris-connector-1.17-25.1.0.jar \
mysql-sync-database \
--database test_db \
--mysql-conf hostname=database-test.cluster-ro-ckbuyoqerz2c.us-east-1.rds.amazonaws.com \
--mysql-conf port=3306 \
--mysql-conf username=admin \
--mysql-conf password=YOUR_PASSWORD \
--mysql-conf server-id=5400 \
--mysql-conf server-time-zone=UTC \
--doris-conf fenodes=YOUR_VELODB_ENDPOINT \
--doris-conf username=YOUR_VELODB_USER \
--doris-conf password=YOUR_VELODB_PASSWORD \
--doris-conf table.identifier=test_db.student \
--doris-conf sink.enable-delete=true
Replace the placeholder values (YOUR_PASSWORD, YOUR_VELODB_ENDPOINT, etc.) with your actual credentials and endpoint information.
The job will:
- Perform an initial snapshot of test_db.student and load it into VeloDB.
- Continuously read binlog events from Aurora MySQL and apply the corresponding inserts, updates, and deletes to the VeloDB table in real time.
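Once the job is running, end-to-end behavior can be verified by querying VeloDB (which speaks the MySQL protocol) and then making a change on the source side. A minimal check, assuming the sync job from section 6.1 is active:

```sql
-- In VeloDB: the initial snapshot should be present
SELECT COUNT(*) FROM test_db.student;            -- 5 rows after the snapshot

-- In Aurora MySQL: make an incremental change
UPDATE test_db.student SET score = 95.00 WHERE id = 1;

-- In VeloDB again (after the next checkpoint, ~10s with the settings above):
SELECT score FROM test_db.student WHERE id = 1;  -- reflects the update
```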