High-Throughput IoT Log Aggregator

Published: December 25, 2025 at 02:47 PM EST
3 min read
Source: Dev.to

Imagine an industrial monitoring system receiving telemetry packets from thousands of sensors every second. The system must:

  • Ingest a batch of raw data packets.
  • Filter inactive sensors.
  • Aggregate temperature readings by Device ID.
  • Generate a textual summary log for the dashboard.

The challenge is that the process runs continuously, so any inefficiency (e.g., unnecessary memory allocation) can cause garbage‑collection pauses and data loss. The solution uses memory‑efficient patterns in Go.

System Flow Diagram

flowchart LR
    A[Raw Data Ingestion] -->|Slice Pre‑allocation| B[Batch Processing]
    B -->|Value Semantics| C{Filter Inactive}
    C -->|Map Pre‑allocation| D[Aggregation]
    D -->|strings.Builder| E[Log Generation]
    E --> F[Final Report]

Optimization Techniques

Struct Alignment

type SensorPacket struct {
    Timestamp int64   // 8 bytes
    Value     float64 // 8 bytes
    DeviceID  int32   // 4 bytes
    Active    bool    // 1 byte
    // 3 bytes padding added automatically
}

Trimming 8 bytes from each packet saves ~8 MB of RAM per million records.

Slice Pre‑allocation

packets := make([]SensorPacket, 0, n) // n = expected batch size

Appending 100,000 items without pre‑allocation forces the runtime to grow the backing array dozens of times, copying every existing element on each resize.
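The growth cost can be observed directly by watching the slice's capacity change during append. A small sketch (countGrowths is a hypothetical helper, not part of the aggregator):

```go
package main

import "fmt"

type SensorPacket struct {
	Timestamp int64
	Value     float64
	DeviceID  int32
	Active    bool
}

// countGrowths appends n packets to s and returns how many times append
// had to move the data to a larger backing array.
func countGrowths(s []SensorPacket, n int) int {
	grown := 0
	lastCap := cap(s)
	for i := 0; i < n; i++ {
		s = append(s, SensorPacket{})
		if cap(s) != lastCap {
			grown++
			lastCap = cap(s)
		}
	}
	return grown
}

func main() {
	const n = 100_000
	fmt.Println("growths, nil slice:    ", countGrowths(nil, n))
	fmt.Println("growths, pre-allocated:", countGrowths(make([]SensorPacket, 0, n), n)) // 0
}
```

With the capacity supplied up front, the loop never reallocates at all.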

Map Size Hint

agg := make(map[int32]float64, 100) // anticipate ~100 devices

Pre‑allocating buckets prevents expensive rehashing when the map grows.
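In the aggregation step this looks roughly like the sketch below (Reading and aggregate are illustrative names; the real batch type carries more fields):

```go
package main

import "fmt"

// Reading is a hypothetical (DeviceID, temperature) pair from one batch.
type Reading struct {
	ID   int32
	Temp float64
}

// aggregate averages temperatures per device. The size hint (100) lets the
// runtime allocate enough buckets up front, avoiding incremental rehashing
// while the map fills.
func aggregate(readings []Reading) map[int32]float64 {
	sums := make(map[int32]float64, 100)
	counts := make(map[int32]int, 100)
	for _, r := range readings {
		sums[r.ID] += r.Temp
		counts[r.ID]++
	}
	avgs := make(map[int32]float64, len(sums))
	for id, sum := range sums {
		avgs[id] = sum / float64(counts[id])
	}
	return avgs
}

func main() {
	avgs := aggregate([]Reading{{79, 44.1}, {79, 44.94}, {46, 46.42}})
	for id, avg := range avgs {
		fmt.Printf("Device %d: Avg Temp %.2f\n", id, avg)
	}
}
```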

strings.Builder

var sb strings.Builder
sb.WriteString("Device ")
sb.WriteString(strconv.Itoa(id))
sb.WriteString(": Avg Temp ")
sb.WriteString(fmt.Sprintf("%.2f", avg))

Using a builder avoids creating hundreds of temporary strings.

Value vs Pointer

func processBatch(cfg Config, data []SensorPacket) *Report {
    // cfg passed by value (fast stack access)
    // Report returned as a pointer to avoid copying the large map
}
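A fuller version of this sketch, with hypothetical Config and Report fields filled in to make it runnable, could look like:

```go
package main

import "fmt"

// Config is small, so copying it onto the callee's stack is cheap.
type Config struct {
	MinTemp float64
	MaxTemp float64
}

type SensorPacket struct {
	Timestamp int64
	Value     float64
	DeviceID  int32
	Active    bool
}

// Report holds a map; returning *Report avoids copying the struct on return
// (the map inside is a reference type either way).
type Report struct {
	BatchID int64
	Avg     map[int32]float64
}

func processBatch(cfg Config, data []SensorPacket) *Report {
	sums := make(map[int32]float64, 100)
	counts := make(map[int32]int, 100)
	for _, p := range data { // p is a 24-byte copy: cheap, stays on the stack
		if !p.Active || p.Value < cfg.MinTemp || p.Value > cfg.MaxTemp {
			continue // drop inactive sensors and out-of-range readings
		}
		sums[p.DeviceID] += p.Value
		counts[p.DeviceID]++
	}
	avg := make(map[int32]float64, len(sums))
	for id, sum := range sums {
		avg[id] = sum / float64(counts[id])
	}
	return &Report{BatchID: 1, Avg: avg}
}

func main() {
	data := []SensorPacket{
		{DeviceID: 79, Value: 44.50, Active: true},
		{DeviceID: 79, Value: 44.54, Active: true},
		{DeviceID: 11, Value: 44.54, Active: false}, // filtered out
	}
	r := processBatch(Config{MinTemp: -40, MaxTemp: 125}, data)
	fmt.Printf("Device 79: Avg Temp %.2f\n", r.Avg[79])
}
```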

Memory Layout Comparison

Optimized (Current Code)

[ Timestamp (8) ] [ Value (8) ] [ DeviceID (4) | Active (1) | Pad (3) ]
Total: 24 Bytes / Block

Unoptimized (Mixed)

[ Active (1) | Pad (7) ] [ Timestamp (8) ] [ DeviceID (4) | Pad (4) ] [ Value (8) ]
Total: 32 Bytes / Block (33% larger than the optimized layout)

Example Results

--- Processing Complete in 6.5627ms ---
--- BATCH REPORT ---
Batch ID: 1766689634
Device 79: Avg Temp 44.52
Device 46: Avg Temp 46.42
Device 57: Avg Temp 45.37
Device 11: Avg Temp 44.54
Device 15: Avg Temp 46.43
... (truncated)

Benchmark Results

Operation    | Implementation        | Time (ns/op) | Memory (B/op) | Allocs/op | Performance Gain
-------------|-----------------------|--------------|---------------|-----------|------------------
Slice Append | Inefficient           | 66,035       | 357,626       | 19        | (baseline)
Slice Append | Efficient (Pre‑alloc) | 15,873       | 81,920        | 1         | ~4.1× Faster
String Build | Inefficient (+)       | 8,727        | 21,080        | 99        | (baseline)
String Build | Efficient (Builder)   | 244.7        | 416           | 1         | ~35.6× Faster
Map Insert   | Inefficient           | 571,279      | 591,485       | 79        | (baseline)
Map Insert   | Efficient (Size hint) | 206,910      | 295,554       | 33        | ~2.7× Faster
Struct Pass  | By Value (Copy)       | 0.26         | 0             | 0         | (baseline)
Struct Pass  | By Pointer (Ref)      | 0.25         | 0             | 0         | Similar

Note on structs: In micro‑benchmarks the Go compiler inlines calls, making the difference between passing by value and by pointer negligible. In real‑world code with deeper call stacks, passing large structs by pointer can noticeably reduce CPU usage.

Key Takeaways

  • String Concatenation: Avoid + in loops. strings.Builder is >35× faster and uses 98% less memory by eliminating intermediate garbage strings.
  • Memory Pre‑allocation: Providing capacity up‑front for slices and maps removes the overhead of repeated resizing and rehashing.
  • Allocation Count Matters: Fewer allocations (allocs/op) mean less work for the garbage collector, leading to a more stable and responsive application.
  • Slice: Reduced allocations from 19 → 1.
  • Map: Reduced allocations from 79 → 33.

These patterns collectively keep the IoT log aggregator performant under high‑throughput conditions.
