The Missing Link: Triggering Serverless Events from Legacy Databases with AWS DMS
Source: Dev.to
We live in a world where we want everything to be event‑driven. A new user registration in our SQL database should immediately:
- trigger a welcome email via SES,
- update a CRM via API, and
- start a Step Functions workflow.
If you’re building greenfield on DynamoDB, this is easy (DynamoDB Streams). But what if your data lives in a legacy MySQL monolith, an on‑premises Oracle DB, or a standard PostgreSQL instance?
You need Change Data Capture (CDC)—you need to stream those changes to the cloud.
Naturally you look at AWS DMS (Database Migration Service). It’s perfect for moving data, but you quickly hit a wall:
The Problem
AWS DMS cannot target an AWS Lambda function directly.
You can’t simply configure a task that says “When a row is inserted in Table X, invoke Function Y”.
So, how do we bridge the gap between the “old world” (SQL) and the “new world” (serverless)? While many suggest Kinesis, the most robust and cost‑effective answer is Amazon S3.
Below is the architecture pattern I use to modernize legacy back‑ends without rewriting them.
The Architecture: The “S3 Drop” Pattern
- Source – DMS connects to your legacy database and captures changes (INSERT/UPDATE/DELETE) via the transaction logs.
- Target – DMS writes those changes as JSON files into an S3 bucket.
- Trigger – S3 detects the new file and fires an event notification (see the wiring sketch after this list).
- Compute – Your Lambda function receives the event, reads the file, and processes the business logic.
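
To make the Trigger step concrete, here is a minimal boto3 sketch that wires the bucket to the function. The bucket name, function ARN, and the `.json` suffix filter are placeholder assumptions (your DMS settings determine the actual file extension); in practice you would usually declare this in CloudFormation or Terraform.

```python
import boto3

# Placeholder names -- substitute your own bucket and function ARN.
BUCKET = "legacy-cdc-drop"
FUNCTION_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:cdc-processor"

lambda_client = boto3.client("lambda")
s3 = boto3.client("s3")

# 1. S3 may only invoke the function once its resource policy allows it.
lambda_client.add_permission(
    FunctionName=FUNCTION_ARN,
    StatementId="AllowS3Invoke",
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn=f"arn:aws:s3:::{BUCKET}",
)

# 2. Fire the Lambda for every object DMS drops into the bucket.
s3.put_bucket_notification_configuration(
    Bucket=BUCKET,
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": FUNCTION_ARN,
                "Events": ["s3:ObjectCreated:*"],
                # Optional: only react to the JSON files DMS writes.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".json"}]}
                },
            }
        ]
    },
)
```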

Why S3 Instead of Kinesis or Airbyte?
Why not Kinesis Data Streams?
- Cost – S3 is dramatically cheaper than a provisioned Kinesis stream, especially when the legacy DB is quiet.
- Observability – You can literally see the changes as files in your bucket, making debugging 10× easier.
- Batching – DMS writes to S3 in batches, naturally throttling Lambda invocations during massive write spikes.
Why not Airbyte or Fivetran?
- Those tools excel at ELT pipelines (e.g., loading data into Snowflake every 15–60 minutes).
- Our goal is event‑driven processing—trigger a Lambda as close to “real‑time” as possible.
- AWS DMS offers continuous CDC, delivering a granular stream of events that batch‑based ELT tools often miss.
- Staying 100 % AWS‑native simplifies IAM governance in strict enterprise environments.
Implementation Guide
DMS Endpoint Settings
When creating the target endpoint (S3) in DMS, don’t rely on defaults. Use the following Extra Connection Attributes so the output is Lambda‑friendly:
```
dataFormat=json;
datePartitionEnabled=true;
```
- `dataFormat=json` – DMS defaults to CSV; JSON is far easier for Lambda to parse.
- `datePartitionEnabled=true` – Organizes files by date (`/2023/11/02/...`), preventing a single folder from containing millions of objects.
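If you create the endpoint from code rather than the console, the same attributes can be passed to `create_endpoint`. A minimal boto3 sketch, assuming an existing bucket and a DMS service role (both names below are placeholders):

```python
import boto3

dms = boto3.client("dms")

dms.create_endpoint(
    EndpointIdentifier="legacy-cdc-s3-target",
    EndpointType="target",
    EngineName="s3",
    # Same attributes as above, passed as a single string.
    ExtraConnectionAttributes="dataFormat=json;datePartitionEnabled=true;",
    S3Settings={
        # Placeholder role/bucket -- the role needs s3:PutObject on the bucket.
        "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-s3-access",
        "BucketName": "legacy-cdc-drop",
    },
)
```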
Understanding the Event Structure
A typical DMS‑generated file looks like this (Line‑Delimited JSON, also known as NDJSON):
```json
{"data": {"id": 101, "username": "jdoe", "status": "active"}, "metadata": {"operation": "insert", "timestamp": "2023-11-02T10:00:00Z"}}
{"data": {"id": 102, "username": "asmith", "status": "pending"}, "metadata": {"operation": "update", "timestamp": "2023-11-02T10:05:00Z"}}
```
Each line contains the operation (insert, update, delete) and the payload (data) in a clean package.
Lambda Logic
Because DMS writes NDJSON, you cannot json.loads() the whole file at once. You must iterate line‑by‑line.
Below is a Python boilerplate that handles the file correctly:
```python
import boto3
import json

s3 = boto3.client('s3')

def handler(event, context):
    # 1️⃣ Extract bucket & key from the S3 event
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        print(f"Processing file: s3://{bucket}/{key}")

        # 2️⃣ Retrieve the file generated by DMS
        obj = s3.get_object(Bucket=bucket, Key=key)
        content = obj['Body'].read().decode('utf-8')

        # 3️⃣ Parse NDJSON (line‑delimited JSON)
        for line in content.splitlines():
            if not line.strip():
                continue  # skip empty lines
            row = json.loads(line)

            # 4️⃣ Filter / act on the operation type
            operation = row.get('metadata', {}).get('operation')
            if operation == 'insert':
                user_data = row.get('data')
                # TODO: add your business logic for inserts,
                # e.g. trigger_welcome_email(user_data)
                print(f"INSERT: {user_data}")
            elif operation == 'update':
                user_data = row.get('data')
                # TODO: add your business logic for updates
                print(f"UPDATE: {user_data}")
            elif operation == 'delete':
                # Handle deletes if needed
                print("DELETE operation received")
```
Key points
- Do not call `json.loads(content)` on the whole file.
- Iterate `content.splitlines()` and parse each line individually.
- Use the `metadata.operation` field to route your logic.
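
Before deploying, you can smoke-test the handler locally by feeding it a hand-built S3 event in the same module. The bucket and key below are placeholders; the object must exist in your account (or `s3.get_object` must be stubbed) for the call to succeed.

```python
# Minimal local smoke test for the handler above.
fake_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "legacy-cdc-drop"},
                "object": {"key": "public/users/2023/11/02/20231102-100000123.json"},
            }
        }
    ]
}

if __name__ == "__main__":
    handler(fake_event, None)
```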
TL;DR
- Capture CDC from any legacy RDBMS with AWS DMS.
- Write the changes as JSON files to S3 (date‑partitioned).
- Trigger a Lambda via S3 event notifications.
- Parse the NDJSON payload line‑by‑line and implement your event‑driven business logic.
This “S3 Drop” pattern gives you a low‑cost, observable, and fully AWS‑native bridge between old‑school databases and modern serverless workflows. 🚀
print(f"New User Detected: {user_data['username']}")
# trigger_welcome_email(user_data)
elif operation == 'update':
print(f"User Updated: {row['data']['id']}")
Summary
You don’t need to refactor your entire legacy database to get the benefits of serverless. By using AWS DMS to unlock the data and S3 as a reliable buffer, you can trigger modern Lambda workflows from 20‑year‑old databases with minimal friction. This pattern prioritizes stability and observability over raw speed—a trade‑off that is usually worth it in enterprise migrations.