High-Throughput IoT Log Aggregator
Source: Dev.to

Imagine an industrial monitoring system receiving telemetry packets from thousands of sensors every second. The system must:
- Ingest a batch of raw data packets.
- Filter inactive sensors.
- Aggregate temperature readings by Device ID.
- Generate a textual summary log for the dashboard.
The challenge is that the process runs continuously, so any inefficiency (e.g., unnecessary memory allocation) can cause garbage‑collection pauses and data loss. The solution uses memory‑efficient patterns in Go.
System Flow Diagram
```mermaid
flowchart LR
    A[Raw Data Ingestion] -->|Slice Pre-allocation| B[Batch Processing]
    B -->|Value Semantics| C{Filter Inactive}
    C -->|Map Pre-allocation| D[Aggregation]
    D -->|strings.Builder| E[Log Generation]
    E --> F[Final Report]
```
Optimization Techniques
Struct Alignment
```go
type SensorPacket struct {
	Timestamp int64   // 8 bytes
	Value     float64 // 8 bytes
	DeviceID  int32   // 4 bytes
	Active    bool    // 1 byte
	// 3 bytes of padding added automatically
}
```
Ordering fields from largest to smallest saves 8 bytes per packet, which adds up to roughly 8 MB of RAM per million records.
Slice Pre‑allocation
```go
packets := make([]SensorPacket, 0, n) // n = expected batch size
```
Appending 100,000 items without pre-allocation triggers roughly 18 growth reallocations, each one copying the existing elements to a new backing array.
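One way to see this is to count how often the slice's capacity changes during appends; a minimal sketch (the exact growth count depends on the runtime's growth strategy, so the ~18 figure is approximate):

```go
package main

import "fmt"

// countGrowths appends n ints and counts backing-array reallocations,
// detected as changes in cap(s).
func countGrowths(n int, prealloc bool) int {
	var s []int
	if prealloc {
		s = make([]int, 0, n) // reserve full capacity up front
	}
	growths := 0
	lastCap := cap(s)
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != lastCap {
			growths++
			lastCap = cap(s)
		}
	}
	return growths
}

func main() {
	fmt.Println(countGrowths(100000, false)) // many reallocations
	fmt.Println(countGrowths(100000, true))  // 0: capacity was reserved up front
}
```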
Map Size Hint
```go
agg := make(map[int32]float64, 100) // anticipate ~100 devices
```
Pre‑allocating buckets prevents expensive rehashing when the map grows.
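Applied to the aggregation step, a sketch of filtering plus per-device averaging with pre-sized maps (averaging sums and counts is an assumption about how the aggregator combines readings; the article does not show this function):

```go
package main

import "fmt"

type SensorPacket struct {
	Timestamp int64
	Value     float64
	DeviceID  int32
	Active    bool
}

// aggregate averages temperature readings per device,
// pre-sizing the maps with deviceHint to avoid rehashing as they grow.
func aggregate(packets []SensorPacket, deviceHint int) map[int32]float64 {
	sums := make(map[int32]float64, deviceHint)
	counts := make(map[int32]int, deviceHint)
	for _, p := range packets {
		if !p.Active {
			continue // filter inactive sensors
		}
		sums[p.DeviceID] += p.Value
		counts[p.DeviceID]++
	}
	avg := make(map[int32]float64, len(sums))
	for id, sum := range sums {
		avg[id] = sum / float64(counts[id])
	}
	return avg
}

func main() {
	packets := []SensorPacket{
		{DeviceID: 79, Value: 44.0, Active: true},
		{DeviceID: 79, Value: 45.0, Active: true},
		{DeviceID: 11, Value: 99.0, Active: false}, // dropped by the filter
	}
	fmt.Println(aggregate(packets, 100)) // map[79:44.5]
}
```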
strings.Builder
```go
var sb strings.Builder
sb.WriteString("Device ")
sb.WriteString(strconv.Itoa(id))
sb.WriteString(": Avg Temp ")
sb.WriteString(fmt.Sprintf("%.2f", avg))
line := sb.String() // single final allocation
```
Using a builder avoids creating hundreds of temporary strings.
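Scaled up to the whole report, one builder can accumulate every line before a single final `String()` call; a minimal sketch (sorting the device IDs is an assumption added here to make the output deterministic):

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// buildReport renders all device averages into one string
// using a single strings.Builder.
func buildReport(avg map[int32]float64) string {
	ids := make([]int32, 0, len(avg))
	for id := range avg {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })

	var sb strings.Builder
	for _, id := range ids {
		sb.WriteString("Device ")
		sb.WriteString(strconv.Itoa(int(id)))
		sb.WriteString(": Avg Temp ")
		sb.WriteString(strconv.FormatFloat(avg[id], 'f', 2, 64))
		sb.WriteByte('\n')
	}
	return sb.String()
}

func main() {
	fmt.Print(buildReport(map[int32]float64{79: 44.52, 46: 46.42}))
	// Device 46: Avg Temp 46.42
	// Device 79: Avg Temp 44.52
}
```

Using `strconv.FormatFloat` instead of `fmt.Sprintf` inside the loop avoids the extra allocations that `Sprintf`'s interface arguments incur.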
Value vs Pointer
```go
func processBatch(cfg Config, data []SensorPacket) *Report {
	// cfg passed by value (fast stack access)
	// Report returned as a pointer to avoid copying the large map
}
```
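Putting the pieces together, a sketch of what `processBatch` could look like; the `Config` and `Report` field layouts are assumptions for illustration, since the article does not define them:

```go
package main

import "fmt"

type SensorPacket struct {
	Timestamp int64
	Value     float64
	DeviceID  int32
	Active    bool
}

// Hypothetical shapes: a small, cheaply copyable config
// and a report whose map we want to avoid copying field by field.
type Config struct{ DeviceHint int }
type Report struct{ AvgByDevice map[int32]float64 }

func processBatch(cfg Config, data []SensorPacket) *Report {
	sums := make(map[int32]float64, cfg.DeviceHint)
	counts := make(map[int32]int, cfg.DeviceHint)
	for _, p := range data { // p is a 24-byte value copy: cache-friendly iteration
		if !p.Active {
			continue
		}
		sums[p.DeviceID] += p.Value
		counts[p.DeviceID]++
	}
	avg := make(map[int32]float64, len(sums))
	for id, s := range sums {
		avg[id] = s / float64(counts[id])
	}
	return &Report{AvgByDevice: avg} // pointer return: only an address crosses the call boundary
}

func main() {
	r := processBatch(Config{DeviceHint: 100}, []SensorPacket{
		{DeviceID: 57, Value: 45.37, Active: true},
	})
	fmt.Println(r.AvgByDevice[57]) // 45.37
}
```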
Memory Layout Comparison
Optimized (Current Code)
```text
[ Timestamp (8) ] [ Value (8) ] [ DeviceID (4) | Active (1) | Pad (3) ]
```
Total: 24 bytes per packet
Unoptimized (Mixed)
```text
[ Active (1) | Pad (7) ] [ Timestamp (8) ] [ DeviceID (4) | Pad (4) ] [ Value (8) ]
```
Total: 32 bytes per packet (11 bytes of padding, 33% larger than the optimized layout!)
Example Results
```text
--- Processing Complete in 6.5627ms ---
--- BATCH REPORT ---
Batch ID: 1766689634
Device 79: Avg Temp 44.52
Device 46: Avg Temp 46.42
Device 57: Avg Temp 45.37
Device 11: Avg Temp 44.54
Device 15: Avg Temp 46.43
... (truncated)
```
Benchmark Results
| Operation | Implementation | Time (ns/op) | Memory (B/op) | Allocs (op) | Performance Gain |
|---|---|---|---|---|---|
| Slice Append | Inefficient | 66,035 | 357,626 | 19 | – |
| Slice Append | Efficient (Pre-alloc) | 15,873 | 81,920 | 1 | ~4.1× faster |
| String Build | Inefficient (`+`) | 8,727 | 21,080 | 99 | – |
| String Build | Efficient (Builder) | 244.7 | 416 | 1 | ~35.6× faster |
| Map Insert | Inefficient | 571,279 | 591,485 | 79 | – |
| Map Insert | Efficient (Size hint) | 206,910 | 295,554 | 33 | ~2.7× faster |
| Struct Pass | By Value (Copy) | 0.26 | 0 | 0 | – |
| Struct Pass | By Pointer (Ref) | 0.25 | 0 | 0 | Similar |
Note on structs: In micro‑benchmarks the Go compiler inlines calls, making the difference between passing by value and by pointer negligible. In real‑world code with deeper call stacks, passing large structs by pointer can noticeably reduce CPU usage.
Key Takeaways
- String Concatenation: Avoid `+` in loops. `strings.Builder` is >35× faster and uses 98% less memory by eliminating intermediate garbage strings.
- Memory Pre-allocation: Providing capacity up front for slices and maps removes the overhead of repeated resizing and rehashing.
- Allocation Count Matters: Fewer allocations (`allocs/op`) mean less work for the garbage collector, leading to a more stable and responsive application.
  - Slice: allocations reduced from 19 → 1.
  - Map: allocations reduced from 79 → 33.
These patterns collectively keep the IoT log aggregator performant under high‑throughput conditions.