Designing a Scalable, Cost‑Effective Access Pattern for a High‑Throughput Time‑Series Store
Source: Dev.to
Table Schema & Primary Key
| Attribute | Type | Role |
|---|---|---|
| deviceId | String | Partition key |
| timestamp | String (ISO‑8601, e.g., 2025-12-04T12:34:56Z) | Sort key |
| temperature, humidity, pressure | Number | Payload |
| metadata | String (JSON) | Optional payload |
| ttl | Number (epoch seconds) | TTL attribute for expiration |
Why this PK?
All readings for a device are stored together, enabling efficient range queries (deviceId = X AND timestamp BETWEEN …). A single‑item query for the latest reading can be performed with ScanIndexForward=false and Limit=1.
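As a concrete illustration, here is a minimal boto3 sketch of both access patterns. The table name SensorReadings is an assumption; note that ISO‑8601 UTC timestamps sort lexicographically in chronological order, which is what makes the range condition work.

```python
# Minimal sketch (boto3), assuming a table named SensorReadings with the schema above.
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("SensorReadings")

def latest_reading(device_id):
    """Newest item for one device: sort descending by timestamp, take one item."""
    resp = table.query(
        KeyConditionExpression=Key("deviceId").eq(device_id),
        ScanIndexForward=False,  # newest first
        Limit=1,
    )
    return resp["Items"][0] if resp["Items"] else None

def readings_between(device_id, start_iso, end_iso):
    """Range query: ISO-8601 UTC strings sort lexicographically in time order."""
    resp = table.query(
        KeyConditionExpression=Key("deviceId").eq(device_id)
        & Key("timestamp").between(start_iso, end_iso)
    )
    return resp["Items"]
```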
Indexing Strategy
| Index | Partition Key | Sort Key | Use‑case |
|---|---|---|---|
| Primary Table | deviceId | timestamp | Point lookup & range queries per device |
| Global Secondary Index (GSI) – DeviceLatestGSI | deviceId | timestamp | Serves heavy latest‑reading traffic (Limit=1, ScanIndexForward=false) from its own index capacity, separate from the base table |
| Optional GSI – MetricGSI | metricType (e.g., the constant "temperature") | timestamp | Cross‑device time‑range queries for a single metric (rare) |
Note: The primary table already supports the latest‑reading query; the GSI is optional and worth its extra write cost only if you anticipate many concurrent “latest” reads that would otherwise compete with write traffic on the same deviceId partition. In most cases the primary table with Limit=1 suffices. A table‑creation sketch follows below.
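For concreteness, a minimal boto3 create_table sketch matching the key schema above, with the optional DeviceLatestGSI projecting only the payload attributes. The table name SensorReadings and on‑demand billing are assumptions; drop the GlobalSecondaryIndexes block if the primary‑table query is enough.

```python
# Sketch: table + optional DeviceLatestGSI, on-demand billing (names are assumptions).
import boto3

boto3.client("dynamodb").create_table(
    TableName="SensorReadings",
    AttributeDefinitions=[
        {"AttributeName": "deviceId", "AttributeType": "S"},
        {"AttributeName": "timestamp", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "deviceId", "KeyType": "HASH"},
        {"AttributeName": "timestamp", "KeyType": "RANGE"},
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "DeviceLatestGSI",
            "KeySchema": [
                {"AttributeName": "deviceId", "KeyType": "HASH"},
                {"AttributeName": "timestamp", "KeyType": "RANGE"},
            ],
            # Project only what the latest-reading query needs (keys are always projected).
            "Projection": {
                "ProjectionType": "INCLUDE",
                "NonKeyAttributes": ["temperature", "humidity", "pressure"],
            },
        }
    ],
    BillingMode="PAY_PER_REQUEST",  # on-demand; see the capacity section below
)
```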
Capacity Mode & Scaling
| Mode | When to use | Configuration |
|---|---|---|
| On‑Demand | Unpredictable spikes, easy start‑up, no need to manage capacity. | Handles 10 k writes/sec automatically; pay per request. |
| Provisioned + Auto Scaling | Predictable traffic, want to control cost. | Each write ≤ 1 KB consumes 1 WCU, so sustaining 10 k writes/s needs ~10,000 WCUs; start there (plus, e.g., 15,000 RCUs for reads) and enable auto‑scaling with a 70 % utilization target. |
Cost comparison (approx., US East 1, Dec 2025):
- On‑Demand writes: $1.25 per million write request units → a sustained 10 k writes/s is ≈ 26 B writes/month, or roughly $32 k/month.
- Provisioned: 10,000 WCUs at ≈ $0.00065 per WCU‑hour → roughly $4.7 k/month, plus an auto‑scaling buffer.
On‑Demand is simpler; provisioned is substantially cheaper if traffic is stable (see the back‑of‑the‑envelope calculation below).
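The back‑of‑the‑envelope calculation behind those figures; the per‑unit prices are assumptions taken from the list rates above, so verify current pricing before relying on them:

```python
# Back-of-the-envelope cost check; per-unit prices are assumptions, verify current rates.
WRITES_PER_SEC = 10_000
SECONDS_PER_MONTH = 30 * 24 * 3600                      # 2,592,000
writes_per_month = WRITES_PER_SEC * SECONDS_PER_MONTH   # ~25.9 B write request units

on_demand = writes_per_month / 1e6 * 1.25    # $1.25 per million WRUs -> ~$32,400
provisioned = 10_000 * 0.00065 * 720         # 10,000 WCUs * $/WCU-hour * hours -> ~$4,680

print(f"on-demand   ≈ ${on_demand:,.0f}/month")
print(f"provisioned ≈ ${provisioned:,.0f}/month")
```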
Mitigating Hot‑Partition Risk
- Uniform deviceId distribution: Ensure device IDs are random (e.g., UUID or hashed) so traffic spreads evenly across partitions.
- If a few devices dominate traffic: Use write sharding – append a random shard suffix to deviceId (e.g., deviceId#shard01). Store the shard count in a small config table; the application writes to a random shard and queries all shards, merging the results (see the sketch below). This spreads write capacity across partitions.
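A minimal write‑sharding sketch; the fixed shard count of 4 and the SensorReadings table name are assumptions, and in practice the per‑device shard count would come from the config table mentioned above:

```python
# Write-sharding sketch; shard count of 4 and table name SensorReadings are assumptions.
import random
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("SensorReadings")
SHARD_COUNT = 4  # in practice, read per-device shard counts from the config table

def write_reading(device_id, item):
    """Spread a hot device's writes across SHARD_COUNT logical partitions."""
    shard = random.randrange(SHARD_COUNT)
    item["deviceId"] = f"{device_id}#shard{shard:02d}"  # e.g. sensor-42#shard01
    table.put_item(Item=item)

def query_all_shards(device_id, start_iso, end_iso):
    """Fan out one Query per shard and merge the results by timestamp."""
    items = []
    for shard in range(SHARD_COUNT):
        resp = table.query(
            KeyConditionExpression=Key("deviceId").eq(f"{device_id}#shard{shard:02d}")
            & Key("timestamp").between(start_iso, end_iso)
        )
        items.extend(resp["Items"])
    return sorted(items, key=lambda r: r["timestamp"])
```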
Data Retention (TTL)
- Add a numeric attribute ttl = timestampEpoch + 30 days.
- Enable DynamoDB TTL on this attribute; DynamoDB automatically deletes expired items (typically within 48 h of expiration).
- No additional Lambda needed, keeping cost low (a sketch follows below).
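A minimal sketch of stamping each item with the TTL attribute and enabling TTL on the table; the SensorReadings table name and temperature‑only payload are assumptions:

```python
# TTL sketch: stamp each item with an expiry 30 days out, enable TTL once per table.
import time
from decimal import Decimal
import boto3

THIRTY_DAYS = 30 * 24 * 3600
table = boto3.resource("dynamodb").Table("SensorReadings")  # assumed table name

def put_reading(device_id, iso_timestamp, temperature):
    table.put_item(Item={
        "deviceId": device_id,
        "timestamp": iso_timestamp,
        "temperature": Decimal(str(temperature)),   # boto3 requires Decimal, not float
        "ttl": int(time.time()) + THIRTY_DAYS,      # DynamoDB deletes the item after this
    })

# One-time setting (also available in the console):
boto3.client("dynamodb").update_time_to_live(
    TableName="SensorReadings",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ttl"},
)
```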
Read Performance Optimizations
- Projection: Keep only the needed attributes in the GSI (e.g., temperature, humidity, pressure, timestamp). This reduces read size and cost.
- Consistent vs. eventual reads: Use eventual consistency for most queries (cheaper, 0.5 RCU per 4 KB). For the “latest reading” where freshness is critical, use a strongly consistent read on the base table (1 RCU per 4 KB); GSIs only serve eventually consistent reads.
- BatchGetItem fetches many items in a single call when their full primary keys (deviceId + timestamp) are known; for the latest reading per device, issue one Query per device instead (see the sketch below).
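A read‑side sketch under the same SensorReadings assumption: a strongly consistent latest‑reading query on the base table, and a BatchGetItem for items whose full primary keys are already known.

```python
# Read-side sketch: strong consistency where freshness matters, BatchGetItem where
# the full primary keys are already known (table name SensorReadings is assumed).
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("SensorReadings")

def latest_reading_strong(device_id):
    resp = table.query(
        KeyConditionExpression=Key("deviceId").eq(device_id),
        ScanIndexForward=False,
        Limit=1,
        ConsistentRead=True,  # base table only; GSI reads are always eventually consistent
    )
    return resp["Items"][0] if resp["Items"] else None

def batch_get(keys):
    """keys: list of {"deviceId": ..., "timestamp": ...} dicts (full primary keys)."""
    resp = dynamodb.batch_get_item(RequestItems={"SensorReadings": {"Keys": keys}})
    return resp["Responses"]["SensorReadings"]
```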
Auxiliary Services (optional)
| Service | Purpose |
|---|---|
| Amazon Kinesis Data Streams | Buffer inbound sensor data, smooth bursty writes, and feed DynamoDB via a Lambda consumer (see the sketch after this table). |
| AWS Lambda (TTL cleanup) | If deterministic deletion exactly at 30 days is required, a scheduled Lambda can query items with ttl nearing expiration and delete them, but DynamoDB TTL is usually sufficient. |
| Amazon CloudWatch Alarms | Monitor ConsumedWriteCapacityUnits, ThrottledRequests, and SystemErrors to trigger scaling or alerts. |
| AWS Glue / Athena | For ad‑hoc analytics on historical data exported to S3 (via DynamoDB Streams → Lambda → S3). |
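If the Kinesis buffering path is used, a minimal Lambda consumer could look like the sketch below; the JSON payload shape and the SensorReadings table name are assumptions.

```python
# Sketch of a Lambda consumer for the Kinesis buffering path; the payload shape
# and the SensorReadings table name are assumptions.
import base64
import json
import time
from decimal import Decimal
import boto3

THIRTY_DAYS = 30 * 24 * 3600
table = boto3.resource("dynamodb").Table("SensorReadings")

def handler(event, context):
    # batch_writer buffers puts into BatchWriteItem calls, reducing request overhead.
    with table.batch_writer() as batch:
        for record in event["Records"]:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            batch.put_item(Item={
                "deviceId": payload["deviceId"],
                "timestamp": payload["timestamp"],  # ISO-8601 string
                "temperature": Decimal(str(payload["temperature"])),
                "ttl": int(time.time()) + THIRTY_DAYS,
            })
```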
Trade‑offs Summary
| Trade‑off | Impact |
|---|---|
| On‑Demand vs. Provisioned | On‑Demand simplifies ops but costs several times more than well‑utilized provisioned capacity at a steady 10 k writes/s. Provisioned requires capacity planning but is cheaper with auto‑scaling. |
| Sharding vs. Simplicity | Sharding eliminates hot‑partition risk for skewed device traffic but adds complexity in query logic (multiple shards per device). |
| TTL vs. Lambda cleanup | TTL is low‑cost, eventual deletion (up to 48 h delay). Lambda gives precise control but adds compute cost. |
| GSI for latest reading | Gives latest‑reading queries their own index capacity even under heavy load, but incurs extra write cost (each write updates the GSI). Often unnecessary if Limit=1 on the primary table suffices. |
| Strong vs. eventual consistency | Strong reads double read cost; use only where immediate freshness is required. |
Bottom Line
With this design you achieve:
- Fast point‑lookup (Query with deviceId + Limit=1, ScanIndexForward=false).
- Efficient time‑range queries (Query with deviceId and timestamp BETWEEN …).
- Automatic 30‑day expiration via DynamoDB TTL.
- Cost‑effective high‑throughput writes using on‑demand or provisioned capacity with auto‑scaling, plus optional sharding to avoid hot partitions.