Designing a Scalable, Cost‑Effective Access Pattern for a High‑Throughput Time‑Series Store
Source: Dev.to
Table Schema & Primary Key
| Attribute | Type | Role |
|---|---|---|
| deviceId | String | Partition key |
| timestamp | String (ISO‑8601, e.g., 2025-12-04T12:34:56Z) | Sort key |
| temperature, humidity, pressure | Number | Payload |
| metadata | String (JSON) | Optional payload |
| ttl | Number (epoch seconds) | TTL attribute for expiration |
Why this PK?
All readings for a device are stored together, enabling efficient range queries (deviceId = X AND timestamp BETWEEN …). A single‑item query for the latest reading can be performed with ScanIndexForward=false and Limit=1.
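As a concrete illustration, here is a minimal boto3 sketch of both access patterns. The table name SensorReadings is an assumption; note that ISO‑8601 UTC timestamps sort lexicographically in chronological order, which is what makes the range condition work.

```python
# Minimal sketch (boto3), assuming a table named SensorReadings with the schema above.
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("SensorReadings")

def latest_reading(device_id):
    """Newest item for one device: sort descending by timestamp, take one item."""
    resp = table.query(
        KeyConditionExpression=Key("deviceId").eq(device_id),
        ScanIndexForward=False,  # newest first
        Limit=1,
    )
    return resp["Items"][0] if resp["Items"] else None

def readings_between(device_id, start_iso, end_iso):
    """Range query: ISO-8601 UTC strings sort lexicographically in time order."""
    resp = table.query(
        KeyConditionExpression=Key("deviceId").eq(device_id)
        & Key("timestamp").between(start_iso, end_iso)
    )
    return resp["Items"]
```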
Indexing Strategy
| Index | Partition Key | Sort Key | Use‑case |
|---|---|---|---|
| Primary Table | deviceId | timestamp | Point lookup & range queries per device |
| Global Secondary Index (GSI) – DeviceLatestGSI | deviceId | timestamp | Serves heavy latest‑reading traffic (Limit=1, ScanIndexForward=false) from its own index capacity, separate from the base table |
| Optional GSI – MetricGSI | metricType (e.g., the constant "temperature") | timestamp | Cross‑device time‑range queries for a single metric (rare) |
Note: The primary table already supports the latest‑reading query; the GSI is optional and worth its extra write cost only if you anticipate many concurrent “latest” reads that would otherwise compete with write traffic on the same deviceId partition. In most cases the primary table with Limit=1 suffices. A table‑creation sketch follows below.
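For concreteness, a minimal boto3 create_table sketch matching the key schema above, with the optional DeviceLatestGSI projecting only the payload attributes. The table name SensorReadings and on‑demand billing are assumptions; drop the GlobalSecondaryIndexes block if the primary‑table query is enough.

```python
# Sketch: table + optional DeviceLatestGSI, on-demand billing (names are assumptions).
import boto3

boto3.client("dynamodb").create_table(
    TableName="SensorReadings",
    AttributeDefinitions=[
        {"AttributeName": "deviceId", "AttributeType": "S"},
        {"AttributeName": "timestamp", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "deviceId", "KeyType": "HASH"},
        {"AttributeName": "timestamp", "KeyType": "RANGE"},
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "DeviceLatestGSI",
            "KeySchema": [
                {"AttributeName": "deviceId", "KeyType": "HASH"},
                {"AttributeName": "timestamp", "KeyType": "RANGE"},
            ],
            # Project only what the latest-reading query needs (keys are always projected).
            "Projection": {
                "ProjectionType": "INCLUDE",
                "NonKeyAttributes": ["temperature", "humidity", "pressure"],
            },
        }
    ],
    BillingMode="PAY_PER_REQUEST",  # on-demand; see the capacity section below
)
```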
Capacity Mode & Scaling
| Mode | When to use | Configuration |
|---|---|---|
| On‑Demand | Unpredictable spikes, easy start‑up, no need to manage capacity. | Handles 10 k writes/sec automatically; pay per request. |
| Provisioned + Auto Scaling | Predictable traffic, want to control cost. | Each write ≤ 1 KB consumes 1 WCU, so sustaining 10 k writes/s needs ~10,000 WCUs; start there (plus, e.g., 15,000 RCUs for reads) and enable auto‑scaling with a 70 % utilization target. |
Cost comparison (approx., US East 1, Dec 2025):
- On‑Demand writes: $1.25 per million write request units → a sustained 10 k writes/s is ≈ 26 B writes/month, or roughly $32 k/month.
- Provisioned: 10,000 WCUs at ≈ $0.00065 per WCU‑hour → roughly $4.7 k/month, plus an auto‑scaling buffer.
On‑Demand is simpler; provisioned is substantially cheaper if traffic is stable (see the back‑of‑the‑envelope calculation below).
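The back‑of‑the‑envelope calculation behind those figures; the per‑unit prices are assumptions taken from the list rates above, so verify current pricing before relying on them:

```python
# Back-of-the-envelope cost check; per-unit prices are assumptions, verify current rates.
WRITES_PER_SEC = 10_000
SECONDS_PER_MONTH = 30 * 24 * 3600                      # 2,592,000
writes_per_month = WRITES_PER_SEC * SECONDS_PER_MONTH   # ~25.9 B write request units

on_demand = writes_per_month / 1e6 * 1.25    # $1.25 per million WRUs -> ~$32,400
provisioned = 10_000 * 0.00065 * 720         # 10,000 WCUs * $/WCU-hour * hours -> ~$4,680

print(f"on-demand   ≈ ${on_demand:,.0f}/month")
print(f"provisioned ≈ ${provisioned:,.0f}/month")
```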
Mitigating Hot‑Partition Risk
- Uniform deviceId distribution: Ensure device IDs are random (e.g., UUID or hashed) so traffic spreads evenly across partitions.
- If a few devices dominate traffic: Use write sharding – append a random shard suffix to deviceId (e.g., deviceId#shard01). Store the shard count in a small config table; the application writes to a random shard and queries all shards, merging the results (see the sketch below). This spreads write capacity across partitions.
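A minimal write‑sharding sketch; the fixed shard count of 4 and the SensorReadings table name are assumptions, and in practice the per‑device shard count would come from the config table mentioned above:

```python
# Write-sharding sketch; shard count of 4 and table name SensorReadings are assumptions.
import random
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("SensorReadings")
SHARD_COUNT = 4  # in practice, read per-device shard counts from the config table

def write_reading(device_id, item):
    """Spread a hot device's writes across SHARD_COUNT logical partitions."""
    shard = random.randrange(SHARD_COUNT)
    item["deviceId"] = f"{device_id}#shard{shard:02d}"  # e.g. sensor-42#shard01
    table.put_item(Item=item)

def query_all_shards(device_id, start_iso, end_iso):
    """Fan out one Query per shard and merge the results by timestamp."""
    items = []
    for shard in range(SHARD_COUNT):
        resp = table.query(
            KeyConditionExpression=Key("deviceId").eq(f"{device_id}#shard{shard:02d}")
            & Key("timestamp").between(start_iso, end_iso)
        )
        items.extend(resp["Items"])
    return sorted(items, key=lambda r: r["timestamp"])
```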
Data Retention (TTL)
- Add a numeric attribute ttl = timestampEpoch + 30 days.
- Enable DynamoDB TTL on this attribute; DynamoDB automatically deletes expired items (typically within 48 h of expiration).
- No additional Lambda needed, keeping cost low (a sketch follows below).
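A minimal sketch of stamping each item with the TTL attribute and enabling TTL on the table; the SensorReadings table name and temperature‑only payload are assumptions:

```python
# TTL sketch: stamp each item with an expiry 30 days out, enable TTL once per table.
import time
from decimal import Decimal
import boto3

THIRTY_DAYS = 30 * 24 * 3600
table = boto3.resource("dynamodb").Table("SensorReadings")  # assumed table name

def put_reading(device_id, iso_timestamp, temperature):
    table.put_item(Item={
        "deviceId": device_id,
        "timestamp": iso_timestamp,
        "temperature": Decimal(str(temperature)),   # boto3 requires Decimal, not float
        "ttl": int(time.time()) + THIRTY_DAYS,      # DynamoDB deletes the item after this
    })

# One-time setting (also available in the console):
boto3.client("dynamodb").update_time_to_live(
    TableName="SensorReadings",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ttl"},
)
```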
Read Performance Optimizations
- Projection: Keep only the needed attributes in the GSI (e.g., temperature, humidity, pressure, timestamp). This reduces read size and cost.
- Consistent vs. eventual reads: Use eventual consistency for most queries (cheaper, 0.5 RCU per 4 KB). For the “latest reading” where freshness is critical, use a strongly consistent read on the base table (1 RCU per 4 KB); GSIs only serve eventually consistent reads.
- BatchGetItem fetches many items in a single call when their full primary keys (deviceId + timestamp) are known; for the latest reading per device, issue one Query per device instead (see the sketch below).
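A read‑side sketch under the same SensorReadings assumption: a strongly consistent latest‑reading query on the base table, and a BatchGetItem for items whose full primary keys are already known.

```python
# Read-side sketch: strong consistency where freshness matters, BatchGetItem where
# the full primary keys are already known (table name SensorReadings is assumed).
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("SensorReadings")

def latest_reading_strong(device_id):
    resp = table.query(
        KeyConditionExpression=Key("deviceId").eq(device_id),
        ScanIndexForward=False,
        Limit=1,
        ConsistentRead=True,  # base table only; GSI reads are always eventually consistent
    )
    return resp["Items"][0] if resp["Items"] else None

def batch_get(keys):
    """keys: list of {"deviceId": ..., "timestamp": ...} dicts (full primary keys)."""
    resp = dynamodb.batch_get_item(RequestItems={"SensorReadings": {"Keys": keys}})
    return resp["Responses"]["SensorReadings"]
```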
Auxiliary Services (optional)
| Service | Purpose |
|---|---|
| Amazon Kinesis Data Streams | Buffer inbound sensor data, smooth bursty writes, and feed DynamoDB via a Lambda consumer (see the sketch after this table). |
| AWS Lambda (TTL cleanup) | If deterministic deletion exactly at 30 days is required, a scheduled Lambda can query items with ttl nearing expiration and delete them, but DynamoDB TTL is usually sufficient. |
| Amazon CloudWatch Alarms | Monitor ConsumedWriteCapacityUnits, ThrottledRequests, and SystemErrors to trigger scaling or alerts. |
| AWS Glue / Athena | For ad‑hoc analytics on historical data exported to S3 (via DynamoDB Streams → Lambda → S3). |
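If the Kinesis buffering path is used, a minimal Lambda consumer could look like the sketch below; the JSON payload shape and the SensorReadings table name are assumptions.

```python
# Sketch of a Lambda consumer for the Kinesis buffering path; the payload shape
# and the SensorReadings table name are assumptions.
import base64
import json
import time
from decimal import Decimal
import boto3

THIRTY_DAYS = 30 * 24 * 3600
table = boto3.resource("dynamodb").Table("SensorReadings")

def handler(event, context):
    # batch_writer buffers puts into BatchWriteItem calls, reducing request overhead.
    with table.batch_writer() as batch:
        for record in event["Records"]:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            batch.put_item(Item={
                "deviceId": payload["deviceId"],
                "timestamp": payload["timestamp"],  # ISO-8601 string
                "temperature": Decimal(str(payload["temperature"])),
                "ttl": int(time.time()) + THIRTY_DAYS,
            })
```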
Trade‑offs Summary
| Trade‑off | Impact |
|---|---|
| On‑Demand vs. Provisioned | On‑Demand simplifies ops but costs several times more than well‑utilized provisioned capacity at a steady 10 k writes/s. Provisioned requires capacity planning but is cheaper with auto‑scaling. |
| Sharding vs. Simplicity | Sharding eliminates hot‑partition risk for skewed device traffic but adds complexity in query logic (multiple shards per device). |
| TTL vs. Lambda cleanup | TTL is low‑cost, eventual deletion (up to 48 h delay). Lambda gives precise control but adds compute cost. |
| GSI for latest reading | Gives latest‑reading queries their own index capacity even under heavy load, but incurs extra write cost (each write updates the GSI). Often unnecessary if Limit=1 on the primary table suffices. |
| Strong vs. eventual consistency | Strong reads double read cost; use only where immediate freshness is required. |
Bottom Line
With this design you achieve:
- Fast point‑lookup (Query with deviceId + Limit=1, ScanIndexForward=false).
- Efficient time‑range queries (Query with deviceId and timestamp BETWEEN …).
- Automatic 30‑day expiration via DynamoDB TTL.
- Cost‑effective high‑throughput writes using on‑demand or provisioned capacity with auto‑scaling, plus optional sharding to avoid hot partitions.