Redis vs DynamoDB vs DAX: I Benchmarked AWS Caching Performance (The Results Were Unexpected)
Source: Dev.to
Benchmarking In‑Memory Caching for DynamoDB Reads
In many backend systems, user data is fetched on almost every request. A common assumption is that adding an in‑memory cache will improve read performance for any system.
To validate this assumption, I benchmarked three approaches for accessing user data from DynamoDB inside a serverless architecture:
| Approach | Description |
|---|---|
| Baseline | AWS Lambda + DynamoDB (no cache) |
| Cache‑aside | Lambda → Redis → DynamoDB |
| DAX | Lambda → DynamoDB Accelerator (DAX) → DynamoDB |
The goal was to see whether a CTO of a startup should invest in caching services. I expected the cached approaches to outperform the baseline easily, but even at 200 requests / second (12 000 requests / minute) that assumption didn’t hold.
Experimental Setup
The architectures for all three tests are kept as identical as possible, using the cheapest available options in the eu‑central‑1 AWS region.
1. Baseline (DDB + Lambda)
| Component | Configuration |
|---|---|
| Lambda | Python 3.12, 256 MiB memory, 10 s timeout |
| DynamoDB | Pay‑per‑request billing mode |
| Access pattern | Direct reads from DynamoDB (no cache) |
| Latency | Highest (no cache) |
2. DAX (DynamoDB Accelerator)
| Component | Configuration |
|---|---|
| Lambda | Python 3.12, 256 MiB memory, 10 s timeout |
| DAX cluster | dax.t3.small (single node, replication factor = 1) deployed in VPC isolated subnets |
| Cache | Managed by DAX automatically (default item TTL ≈ 5 min, query cache TTL ≈ 5 min) |
| Access pattern | Lambda → DAX → DynamoDB |
| Client | amazondax Python client (installed via pip) |
3. Redis (AWS ElastiCache)
| Component | Configuration |
|---|---|
| Lambda | Python 3.12, 256 MiB memory, 10 s timeout |
| Redis cluster | cache.t4g.micro (single node, no automatic failover) deployed in VPC isolated subnets |
| Cache TTL | 30 seconds (configurable via REDIS_TTL_SECONDS) |
| Access pattern | (1) Lambda checks Redis first; (2) on a miss, read from DynamoDB and store the result in Redis with a 30 s TTL; (3) on a hit, return the cached data |
| Client | Standard redis-py client with 1 s connection timeout |
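The cache‑aside flow described in the table above can be sketched as follows. To keep the sketch runnable without AWS, a dict‑backed fake cache stands in for Redis and an injected `db_get` callable stands in for the DynamoDB read; with `redis-py` you would swap in `r.get(key)` and `r.setex(key, ttl, value)` instead.

```python
import time

def make_cache_aside_reader(cache_get, cache_set, db_get, ttl_seconds=30):
    """Cache-aside read: check the cache first, fall back to the
    database on a miss, then populate the cache with a TTL."""
    def read(key):
        value = cache_get(key)
        if value is not None:
            return value                    # cache hit
        value = db_get(key)                 # cache miss: go to DynamoDB
        cache_set(key, value, ttl_seconds)  # write-back with TTL
        return value
    return read

# A dict-backed stand-in for Redis so the sketch runs anywhere.
_store = {}

def fake_cache_get(key):
    entry = _store.get(key)
    if entry is None or entry[1] < time.monotonic():
        return None                         # missing or expired
    return entry[0]

def fake_cache_set(key, value, ttl):
    _store[key] = (value, time.monotonic() + ttl)
```

Note that every request pays the cache round trip, hit or miss; this is the extra network hop that shows up in the steady‑state numbers below.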
Test Methodology
- Load levels – 50 reads / s (3 000 reads / min) and 200 reads / s (12 000 reads / min).
- Payload – Same size for all runs, using a hot/mixed key distribution.
- Metric – 95th‑percentile latency (p95).
- Tool – Open‑source load‑testing library k6 (JavaScript scripts).
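k6 reports p95 directly; as a quick reminder of what the metric means, here is a minimal sketch using the nearest‑rank method (an illustration only; k6 applies its own interpolation):

```python
import math

def p95(samples_ms):
    """95th-percentile latency: the smallest sample that is greater
    than or equal to 95% of all samples (nearest-rank method)."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]
```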
All three approaches fetched the same DynamoDB item, e.g.:

```json
{
  "pk": "ITEM#123",
  "sk": "META",
  "itemId": "123",
  "title": "Example Item Title",
  "body": "This is the body content of the item",
  "updatedAt": 1736467200,
  "etag": "a7f8d9e1c2b3a4f5e6d7c8b9a0f1e2d3c4b5a6f7e8d9c0b1a2f3e4d5c6b7a8f9"
}
```
Results – 50 RPS (Establishing the Baseline)
| Access Pattern | p95 Latency (ms) | Avg Latency (ms) | Dropped Iterations | Notes |
|---|---|---|---|---|
| Lambda + DynamoDB (Baseline) | ~63 | ~48 | 0 | Fast, stable, no bottlenecks |
| Redis (warm‑up run) | ~68 | ~66 | 22 | Cache misses + write‑back cost |
| Redis (steady state) | ~63 | ~48 | 0 | Matches baseline, no latency win |
| DAX (single small node) | ~1040 | ~957 | 19 | Cache saturation, unusable |
Interpretation
- Baseline – At 50 RPS the Lambda + DynamoDB combo delivered a p95 latency of ~63 ms with zero drops. DynamoDB on‑demand was not under pressure.
- Redis – The first (warm‑up) run suffered cache misses, inflating latency. Once the cache warmed, latency matched the baseline but did not improve it.
- DAX – The undersized `dax.t3.small` node became CPU‑bound, causing request queuing and p95 latencies above 1 s. This demonstrates that a mis‑sized cache can degrade performance.
Conclusion (50 RPS)
The baseline performed excellently and required no additional caching. Adding Redis reduced DynamoDB load but did not yield a measurable latency benefit. DAX, when undersized, was detrimental.
Results – 200 RPS (The Expected Crossover That Didn’t Happen)
| Access Pattern | p95 Latency (ms) | Avg Latency (ms) | Dropped Iterations | Notes |
|---|---|---|---|---|
| Lambda + DynamoDB (Baseline) | ~63 | ~48 | 13 | Stable, scales linearly |
| Redis (warm‑up run) | ~64 | ~52 | 40 | Cache population under load |
| Redis (steady state) | ~70 | ~58 | 79 | Slightly worse than baseline |
| DAX (single small node) | ~1050 | ~968 | 5 399 | Cluster saturation |
Interpretation
- Baseline – Even at 200 RPS the Lambda + DynamoDB setup maintained ~63 ms p95 latency, confirming its scalability.
- Redis – Again showed two phases. The warm‑up run incurred miss latency; the steady‑state run was marginally slower than the baseline, likely due to added network hops and occasional cache misses under load.
- DAX – The single small node was completely saturated, resulting in massive latency and thousands of dropped iterations.
Conclusion (200 RPS)
The baseline remains the best‑performing, simplest solution. Redis does not provide a latency advantage at these loads, and an undersized DAX cluster harms performance dramatically.
Overall Takeaways
- Baseline Lambda + DynamoDB is often sufficient – On‑demand billing and automatic scaling keep latency low even at 200 RPS.
- Cache‑aside Redis can reduce DynamoDB read pressure but adds network hops; without a very high read‑to‑write ratio or stricter latency SLAs, it may not improve response times.
- DAX is not a plug‑and‑play speed boost – Proper capacity planning is essential; an undersized node can become a bottleneck.
- Caching makes sense when:
  - The read‑to‑write ratio is extremely high (e.g., > 100:1).
  - Latency requirements are sub‑10 ms and the baseline cannot meet them.
  - The workload exhibits strong temporal locality that justifies the extra operational complexity.
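A back‑of‑envelope model makes the trade‑off concrete. In cache‑aside, every request pays the cache hop, and misses additionally pay the database read; the expected per‑request latency is therefore `cache + (1 - hit_ratio) * db`. The numbers below are illustrative assumptions, not measurements from this benchmark:

```python
def expected_latency_ms(hit_ratio, cache_ms, db_ms):
    """Expected per-request latency for cache-aside reads: every
    request pays the cache hop; misses additionally pay the DB read."""
    return cache_ms + (1.0 - hit_ratio) * db_ms
```

With a ~2 ms in‑VPC cache hop, a ~10 ms DynamoDB read, and a 90% hit ratio, the expected read cost is 2 + 0.1 × 10 = 3 ms. But when the end‑to‑end path is dominated by Lambda and network overhead (tens of milliseconds here), that saving disappears into the noise at p95, which is exactly what the measurements show.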
For most startups operating at modest traffic levels (≤ 200 RPS), the simplest architecture—Lambda directly querying DynamoDB—delivers the best cost‑performance balance. Adding a cache should be driven by concrete latency or cost‑reduction goals, not by the assumption that “caching always helps.”
Cache Warm‑up and Redis Stabilization
Once the cache was warm, Redis stabilized but did not outperform the baseline. In steady state:
- p95 latency increased slightly to ~70 ms.
- The number of dropped iterations was higher than with DynamoDB alone.
DAX @ 200 RPS: Saturation Under Load
At 200 RPS the DAX configuration began to collapse:
- Effective throughput dropped well below the target rate.
- p95 latency exceeded one second.
- Thousands of iterations were dropped.
This behavior confirms that DAX is highly sensitive to sizing: an undersized instance does not merely fail to provide a benefit, it actively degrades performance.
Conclusion: 200 RPS
Even at 200 RPS, the dominant cost in this system was not database access but network and managed‑service overhead. Adding a cache did not remove that cost; it added to it, both in latency (an extra network hop on every request) and in money (you still pay for the on‑demand or serverless Redis instance, depending on your choice).
What These Results Actually Prove
- DynamoDB on‑demand scales extremely well for simple reads.
- Redis reduces pressure on the database, not latency.
- Cache warm‑up matters.
- A misconfigured DAX is worse than no cache.
- Latency optimization and scaling optimization are different problems.
Conclusion & Lessons Learned
The results from both the 50 RPS and 200 RPS benchmarks lead to a clear—and somewhat counter‑intuitive—conclusion: for this workload, Lambda backed by DynamoDB on‑demand was already fast enough that adding a cache did not improve user‑visible latency.
- At both load levels, DynamoDB was not the bottleneck.
- End‑to‑end latency was dominated by network distance and managed‑service overhead, not database access time.
- Introducing Redis added an extra network hop and client‑side overhead without removing the dominant cost in the request path.
Redis still served a purpose, but not the one initially expected. It reduced pressure on DynamoDB and flattened backend load, which can be valuable for cost control and future scaling. What it did not do—at these traffic levels—was make requests faster.
DAX Takeaways
- When undersized, DAX simply doesn’t help; it saturates quickly, causing increased latency and dropped requests.
- It requires careful capacity planning and a solid understanding of the workload.
Final Thought
The biggest lesson from this experiment is that measurement before optimization is essential. Even though caching can reduce backend load, it doesn’t automatically lower end‑to‑end latency.
If the read path is already cheap, caching benefits may be invisible. However, when you have a time‑costly operation whose results can be cached, caching is likely worth testing.
In short: Don’t cache because it feels right—cache because the data proves you need it.