Build Log: Shipping a Lean Python Telemetry Agent (CPU, Memory, Disk)
Source: Dev.to
Build Log (April 8, 2026)
Implemented the first production‑ready telemetry collectors for heka‑insights‑agent and wired them into the main polling loop.
- Added an optimized CPUCollector in src/collectors/cpu.py
- Added a MemoryCollector in src/collectors/memory.py
- Added a DiskCollector in src/collectors/disk.py
- Integrated all collectors into src/main.py with a shared loop
- Added environment‑based poll‑interval support via CPU_POLL_INTERVAL_SECONDS
- Added python-dotenv to requirements.txt
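As a rough illustration of the environment-based interval, the sketch below reads CPU_POLL_INTERVAL_SECONDS and falls back to a default when the value is missing or unusable. The function names, the 5-second default, and the validation rules are assumptions for illustration; only the variable name and the python-dotenv dependency come from the log above.

```python
import math
import os

try:
    # python-dotenv is optional here so the sketch also runs without it.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None

DEFAULT_INTERVAL_SECONDS = 5.0  # hypothetical default, not taken from the agent


def load_env_file():
    """Load variables from a local .env file if python-dotenv is installed."""
    if load_dotenv is not None:
        load_dotenv()


def read_poll_interval(environ=os.environ, default=DEFAULT_INTERVAL_SECONDS):
    """Return the poll interval in seconds, or the default on bad input."""
    raw = environ.get("CPU_POLL_INTERVAL_SECONDS")
    if raw is None or not raw.strip():
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    # Reject zero, negative, NaN, and infinite intervals.
    if not math.isfinite(value) or value <= 0:
        return default
    return value
```

Calling load_env_file() once at startup and then read_poll_interval() keeps configuration parsing out of the polling loop itself.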
CPU Collector Design
The CPU collector is built around psutil.cpu_times(...) snapshots and delta math (single source), rather than calling both cpu_percent and cpu_times_percent each cycle.
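The delta math can be sketched as follows. This is a minimal illustration rather than the agent's actual code: the CpuTimes stand-in and the cpu_percentages helper are hypothetical, and a three-field tuple replaces the larger namedtuple that psutil.cpu_times() returns so the example runs without psutil installed.

```python
from collections import namedtuple

# Stand-in for a psutil.cpu_times() snapshot: cumulative seconds per state.
# Real snapshots carry more fields (nice, iowait, irq, ...).
CpuTimes = namedtuple("CpuTimes", ["user", "system", "idle"])


def cpu_percentages(previous, current):
    """Turn two cumulative snapshots into per-state percentages.

    Returns None when there is no previous snapshot to diff against.
    """
    if previous is None:
        return None
    deltas = {
        field: max(0.0, getattr(current, field) - getattr(previous, field))
        for field in current._fields
    }
    total = sum(deltas.values())
    if total <= 0:
        return {field: 0.0 for field in deltas}
    # Caveat: on Linux, guest time is also counted inside user time, so a
    # version that sums every real psutil field would need to account for that.
    return {field: round(100.0 * delta / total, 1) for field, delta in deltas.items()}


prev = CpuTimes(user=100.0, system=50.0, idle=850.0)
curr = CpuTimes(user=102.0, system=51.0, idle=857.0)
print(cpu_percentages(prev, curr))
# → {'user': 20.0, 'system': 10.0, 'idle': 70.0}
```

Because every percentage comes from the same pair of snapshots, the values always sum to roughly 100, which is harder to guarantee when two separate psutil calls sample at slightly different moments.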
Key design points
- No thread offloading (to_thread) for this workload
- First cycle acts as a warm‑up by design
- Supports basic and detailed output modes
- Optional per‑core output
- Uses MonotonicTicker to keep a fixed cadence without drift
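The MonotonicTicker source is not shown in this log, so the class below is a hypothetical stand-in that illustrates one common way to hold a fixed cadence: schedule each tick against an absolute deadline on time.monotonic() instead of sleeping for a fixed duration after the work finishes. The class name, the injectable clock and sleep parameters, and the overrun policy are all assumptions.

```python
import time


class FixedCadenceTicker:
    """Hypothetical sketch of a drift-free ticker (not the agent's class)."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self.next_deadline = clock() + interval

    def wait(self):
        """Block until the next deadline and return the seconds slept."""
        delay = self.next_deadline - self._clock()
        if delay > 0:
            self._sleep(delay)
            self.next_deadline += self.interval
            return delay
        # The cycle overran: fire immediately, skip the missed ticks, and
        # stay on the original grid instead of shifting every later tick.
        missed = int(-delay // self.interval) + 1
        self.next_deadline += missed * self.interval
        return 0.0
```

Sleeping for `interval` after each cycle would add the collection time to every period; anchoring to deadlines keeps the period constant as long as a cycle finishes within one interval.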
Implementation highlights
# src/collectors/cpu.py (excerpt)
cpu_times_snapshot = psutil.cpu_times()
# delta calculation performed on subsequent snapshots
Memory Collection
Memory collection is intentionally lightweight:
- One call each to psutil.virtual_memory() and psutil.swap_memory()
- Basic mode returns a compact set of key fields
- Detailed mode returns the full psutil fields
- Raw byte values are preserved (server‑side compute handles transformations)
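One way the basic and detailed modes could be expressed is shown below. The choice of basic fields and the memory_payload helper are assumptions for illustration; the log does not list which fields the agent keeps. A namedtuple stands in for the psutil.virtual_memory() result so the sketch runs without psutil.

```python
from collections import namedtuple

# Stand-in for psutil.virtual_memory(); the real result is also a namedtuple
# and may expose additional platform-specific fields.
VirtualMemory = namedtuple(
    "VirtualMemory", ["total", "available", "percent", "used", "free"]
)

BASIC_FIELDS = ("total", "available", "used")  # hypothetical selection


def memory_payload(snapshot, detailed=False):
    """Return raw values from a snapshot, compact by default."""
    fields = snapshot._asdict()
    if detailed:
        return dict(fields)
    # Values are passed through untouched: no unit conversion or rounding.
    return {name: fields[name] for name in BASIC_FIELDS if name in fields}


snapshot = VirtualMemory(
    total=16_000_000_000,
    available=8_000_000_000,
    percent=50.0,
    used=7_000_000_000,
    free=1_000_000_000,
)
print(memory_payload(snapshot))
# → {'total': 16000000000, 'available': 8000000000, 'used': 7000000000}
```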
# src/collectors/memory.py (excerpt)
mem = psutil.virtual_memory()
swap = psutil.swap_memory()
Disk Collection
For disk telemetry, the agent collects cumulative I/O counters rather than rates, because rates are derived centrally on the server side.
- Uses psutil.disk_io_counters(perdisk=True)
- Returns aggregate and per‑disk counters
- Filters to physical devices only; excludes partitions from the per‑disk payload
- Added a device‑name cache with periodic refresh to reduce repeated filtering overhead
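The device-name cache might look roughly like this. How the agent decides which names are physical devices is not shown in the log, so that step is left as an injected callable; DeviceNameCache, filter_per_disk, and the 60-second refresh period are hypothetical names and values.

```python
import time


class DeviceNameCache:
    """Caches the set of device names and refreshes it periodically."""

    def __init__(self, list_devices, refresh_seconds=60.0, clock=time.monotonic):
        self._list_devices = list_devices  # callable returning device names
        self._refresh_seconds = refresh_seconds
        self._clock = clock
        self._names = None
        self._loaded_at = 0.0

    def names(self):
        now = self._clock()
        stale = now - self._loaded_at >= self._refresh_seconds
        if self._names is None or stale:
            # Only here is the (potentially slow) device discovery performed.
            self._names = frozenset(self._list_devices())
            self._loaded_at = now
        return self._names


def filter_per_disk(per_disk, cache):
    """Keep only entries whose device name is in the cached set.

    per_disk has the shape returned by psutil.disk_io_counters(perdisk=True):
    a dict mapping device name to a counters tuple.
    """
    allowed = cache.names()
    return {name: counters for name, counters in per_disk.items() if name in allowed}
```

The periodic refresh is what lets hot-plugged or removed disks show up eventually without paying for discovery on every cycle.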
# src/collectors/disk.py (excerpt)
disk_io = psutil.disk_io_counters(perdisk=True)
Profiling Summary
Ran a 120‑second profiling session and examined both process stats and cProfile output.
Key findings
- Agent CPU cost is very low (near‑idle for this polling interval)
- Max RSS ≈ 15 MB
- Runtime is dominated by intentional sleep (expected)
- Collector costs are small; disk collection is the heaviest of the three
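A comparable measurement can be reproduced with the standard library alone. The sketch below is not the session used for the numbers above: fake_poll_cycle is a placeholder for real agent work, and profile_call is a hypothetical helper that combines cProfile output with the process's peak RSS.

```python
import cProfile
import io
import pstats
import resource
import time


def fake_poll_cycle():
    """Placeholder for one agent cycle; hypothetical, not the real loop."""
    time.sleep(0.01)
    return "ok"


def profile_call(func, top=10):
    """Profile one call and return (result, report text, peak RSS)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)

    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS).
    max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return result, buffer.getvalue(), max_rss


result, report, max_rss = profile_call(fake_poll_cycle)
print(report)
print("peak RSS:", max_rss)
```

In a polling agent the report is expected to be dominated by the sleep call, which is why cumulative time per collector is the more useful column to read.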
Optimizations applied
- Cached the physical‑device list to avoid filtering every cycle
- Kept the output shape unchanged (disk_io + disk_io_perdisk)
The agent now has a clean baseline telemetry pipeline with low overhead and clear extension points for transport/shipping.
Next Planned Work
- Add payload shipping to a backend endpoint
- Implement bounded retry/backoff logic
- Write collector‑focused tests
Project Overview
heka‑insights‑agent – a lightweight agent for collecting essential Linux system telemetry and shipping it to a configurable backend.