AWS re:Invent 2025 - Advanced data modeling with Amazon DynamoDB (DAT414)

Published: December 5, 2025 at 06:25 PM EST
3 min read
Source: Dev.to

Overview

In this session Alex DeBrie covers advanced DynamoDB data modeling, focusing on three key areas:

  1. Secondary indexes – the new multi‑attribute composite keys feature that eliminates synthetic‑key overhead.
  2. Schema evolution – strategies for handling new attributes and back‑filling existing data.
  3. Common anti‑patterns – “kitchen‑sink” item collections, over‑normalization, and how to avoid them.

He also emphasizes DynamoDB’s partitioning model, consumption‑based pricing, and API design to achieve consistent performance at scale while keeping implementations simple and cost‑effective.

This article is auto‑generated from the original presentation. Minor typos or inaccuracies may be present.

Introduction: Advanced Data Modeling with Amazon DynamoDB

Thank you all for coming. This is Advanced Data Modeling with Amazon DynamoDB. I’m Alex DeBrie, AWS Data Hero. This is my seventh year speaking at re:Invent.

In an hour I can’t cover everything about DynamoDB, but I’ll walk through:

  • Background and data‑modeling goals.
  • Using secondary indexes effectively.
  • Recent schema‑evolution features (released two weeks ago).
  • A quick anti‑pattern clinic.

Feel free to check out prior years’ talks on YouTube for deeper dives on related topics.

DynamoDB’s Unique Characteristics: Fully Managed, Consumption‑Based Pricing, and Consistent Performance

Fully Managed

DynamoDB runs on a region‑wide, multi‑tenant fleet of storage nodes, load balancers, and request routers. The service is self‑healing and cannot be taken down by a single user. Unlike relational databases or OpenSearch, you don’t manage instances, patches, or scaling operations.

Consumption‑Based Pricing

In provisioned mode you pay for Read Capacity Units (RCUs) and Write Capacity Units (WCUs); in on‑demand mode you pay per request. Either way, throughput cost tracks actual usage rather than provisioned CPU, memory, or instance size, which makes expenses easier to predict for variable workloads.
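As a sketch of the capacity math (the 4 KB read unit and 1 KB write unit are DynamoDB's standard sizing; the item sizes below are illustrative):

```python
import math

def read_capacity_units(item_size_bytes: int, strongly_consistent: bool = False) -> float:
    """RCUs for one read: 1 RCU per 4 KB strongly consistent,
    half that for an eventually consistent read."""
    units = math.ceil(item_size_bytes / 4096)
    return units if strongly_consistent else units / 2

def write_capacity_units(item_size_bytes: int) -> int:
    """WCUs for one write: 1 WCU per 1 KB."""
    return math.ceil(item_size_bytes / 1024)

# A 6 KB item: 2 RCUs strongly consistent, 1 RCU eventually consistent, 6 WCUs to write.
print(read_capacity_units(6144, strongly_consistent=True))  # 2
print(read_capacity_units(6144))                            # 1.0
print(write_capacity_units(6144))                           # 6
```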

Consistent Performance

Because DynamoDB partitions data based on the partition key’s hash value, it can deliver single‑digit‑millisecond latency at any scale, provided the access pattern distributes evenly across partitions.
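To see why even distribution matters, here is a toy model of hash partitioning. DynamoDB's internal hash function is not MD5, and the partition count is managed by the service; this only illustrates the principle that the key's hash picks the partition:

```python
import hashlib
from collections import Counter

def partition_for(partition_key: str, num_partitions: int = 8) -> int:
    """Map a partition key to a partition via its hash (illustrative only)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# High-cardinality keys (one per user) spread requests evenly across partitions,
# which is what keeps latency flat as the table grows.
spread = Counter(partition_for(f"USER#{i}") for i in range(10_000))
print(sorted(spread.values()))  # each partition holds roughly 1,250 of the 10,000 keys
```

A low-cardinality key (say, `STATUS#ACTIVE`) would send every request to one partition, creating the hot spot this design avoids.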

Data‑Modeling Goals & Process

  1. Define access patterns first – know the queries you need to support.
  2. Choose partition and sort keys that enable those queries without secondary indexes when possible.
  3. Add secondary indexes (GSI/LSI) only when a pattern cannot be satisfied by the primary key.
  4. Iterate – refine the model as new requirements emerge.
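As a minimal sketch of steps 1 and 2 (the entity names and key shapes here are hypothetical, not from the session), two access patterns might drive a key design like this:

```python
# Hypothetical single-table design for two access patterns:
#   1. Get a user's profile.
#   2. List a user's orders within a date range.
# The partition key groups one user's items; the sort key orders them.

def user_profile_key(user_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": "PROFILE"}

def order_key(user_id: str, order_date: str, order_id: str) -> dict:
    # ISO dates sort lexicographically, so the sort key supports range queries.
    return {"PK": f"USER#{user_id}", "SK": f"ORDER#{order_date}#{order_id}"}

def orders_in_range_params(user_id: str, start: str, end: str) -> dict:
    """boto3-style Query parameters for access pattern 2 -- no index needed."""
    return {
        "KeyConditionExpression": "PK = :pk AND SK BETWEEN :lo AND :hi",
        "ExpressionAttributeValues": {
            ":pk": f"USER#{user_id}",
            ":lo": f"ORDER#{start}",
            ":hi": f"ORDER#{end}\uffff",  # \uffff makes the date prefix inclusive
        },
    }

print(order_key("123", "2025-09-01", "12345"))
```

Both patterns are served by the base table's primary key alone, which is the point of step 3: indexes come later, and only if a pattern cannot be expressed this way.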

Secondary Indexes & Multi‑Attribute Composite Keys

The new composite‑key feature lets you combine multiple attributes into a single index key, removing the need for synthetic “join” attributes. This reduces item size and simplifies query logic.

Example (pseudo‑code) of the traditional synthetic‑key approach that this feature replaces:

{
  "PK": "USER#123",
  "SK": "ORDER#2025-09-01",
  "GSI1PK": "STATUS#COMPLETED#DATE#2025-09-01",
  "GSI1SK": "ORDER#12345"
}

Here GSI1PK concatenates status and date into a synthetic attribute so that "all completed orders on a given day" becomes a single Query. With multi‑attribute composite keys, the index key can instead be declared on the status and date attributes directly, so the application no longer has to build and maintain GSI1PK itself.
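For instance, that access pattern can be expressed as a Query against the index. This is a sketch in boto3 style; the index name GSI1 and the key shape follow the item example above:

```python
def completed_orders_on(day: str) -> dict:
    """boto3-style Query parameters for 'all completed orders on a given day',
    using the synthetic GSI1PK shown in the item above."""
    return {
        "IndexName": "GSI1",
        "KeyConditionExpression": "GSI1PK = :pk",
        "ExpressionAttributeValues": {":pk": f"STATUS#COMPLETED#DATE#{day}"},
    }

# Usage with a boto3 Table resource (not executed here):
#   table.query(**completed_orders_on("2025-09-01"))
print(completed_orders_on("2025-09-01"))
```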

Schema Evolution

When adding new attributes:

  • Back‑fill existing items using a DynamoDB Scan + BatchWrite or an AWS Lambda that processes DynamoDB Streams.
  • Leverage default values in application code for items that lack the new attribute.
  • Version your items (e.g., schemaVersion attribute) to handle multiple schema versions gracefully.
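A minimal sketch of the lazy, read‑time variant of the last two points (the loyaltyTier attribute and version numbers are hypothetical):

```python
CURRENT_SCHEMA_VERSION = 2

def upgrade_item(item: dict) -> dict:
    """Migrate an item in application code as it is read: fill defaults for
    attributes added after the item was written, then stamp the version."""
    version = item.get("schemaVersion", 1)
    if version < 2:
        item.setdefault("loyaltyTier", "STANDARD")  # hypothetical attribute added in v2
    item["schemaVersion"] = CURRENT_SCHEMA_VERSION
    return item

old = {"PK": "USER#123", "SK": "PROFILE"}  # written before the new attribute existed
print(upgrade_item(old))
```

Items migrated this way can optionally be written back, so the Scan/Streams back‑fill and the lazy upgrade converge on the same final state.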

Anti‑Pattern Clinic

  • Kitchen‑sink item collections – storing unrelated data in a single item. Problem: large items increase read/write latency and cost, and can hit the 400 KB item‑size limit. Fix: split into multiple items or use separate tables/GSIs.
  • Over‑normalization – scattering related data across many tables. Problem: requires multiple round‑trips, defeating DynamoDB’s single‑digit‑millisecond latency promise. Fix: denormalize where read performance matters; use composite keys to group related entities.
  • Synthetic keys for every relationship. Problem: adds unnecessary attributes and storage overhead. Fix: use multi‑attribute composite keys or sparse indexes.

Closing

I’ll be at the DynamoDB booth in the expo hall this afternoon—feel free to stop by with questions. The session slides are available on the re:Invent video page, and the full recording can be watched here:

Watch the session on YouTube
