Build Log: Shipping a Lean Python Telemetry Agent (CPU, Memory, Disk)

Published: 3 weeks ago (April 8, 2026 at 05:29 AM EDT)

Source: Dev.to

Build Log (April 8, 2026)

Implemented the first production‑ready telemetry collectors for heka‑insights‑agent and wired them into the main polling loop.

Added an optimized CPUCollector in src/collectors/cpu.py
Added a MemoryCollector in src/collectors/memory.py
Added a DiskCollector in src/collectors/disk.py
Integrated all collectors into src/main.py with a shared loop
Added environment‑based poll‑interval support via CPU_POLL_INTERVAL_SECONDS
Added python-dotenv to requirements.txt
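An environment-driven poll interval could be read roughly as follows. This is a minimal sketch, not the agent's actual code: only the variable name CPU_POLL_INTERVAL_SECONDS comes from the log, while the helper name read_poll_interval, the 5-second default, and the fallback behavior are assumptions made for illustration.

```python
import math
import os

try:
    # python-dotenv is optional here; the sketch still runs without it.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None

# Hypothetical default; the real agent's default is not stated in the log.
DEFAULT_POLL_INTERVAL_SECONDS = 5.0


def read_poll_interval(environ=os.environ, default=DEFAULT_POLL_INTERVAL_SECONDS):
    """Return the poll interval in seconds, falling back to `default`."""
    raw = environ.get("CPU_POLL_INTERVAL_SECONDS")
    if raw is None or not raw.strip():
        return default
    try:
        value = float(raw)
    except ValueError:
        # Unparseable values fall back rather than crash the agent.
        return default
    # Reject zero, negative, NaN, and infinite intervals.
    if not math.isfinite(value) or value <= 0:
        return default
    return value


if __name__ == "__main__":
    if load_dotenv is not None:
        load_dotenv()  # merges a local .env file into os.environ, if present
    print(read_poll_interval())
```

Falling back to a default instead of raising keeps a misconfigured .env file from stopping collection, which seems like a reasonable choice for an agent, though a stricter deployment might prefer to fail fast.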

CPU Collector Design

The CPU collector uses psutil.cpu_times(...) snapshots as its single data source and derives usage from the deltas between consecutive snapshots, rather than calling both cpu_percent and cpu_times_percent each cycle.
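The delta math can be sketched as below. This is an illustration under assumptions rather than the collector's real code: the function name busy_percent is hypothetical, idle and iowait are treated as non-busy time, and guest time is subtracted from the total because Linux already counts it inside user and nice. A stand-in named tuple replaces the psutil result so the example runs on its own.

```python
from collections import namedtuple


def _total_time(times):
    """Sum all CPU time fields, without double counting guest time."""
    total = sum(times)
    # On Linux, guest and guest_nice are already included in user and nice.
    total -= getattr(times, "guest", 0.0)
    total -= getattr(times, "guest_nice", 0.0)
    return total


def busy_percent(prev, curr):
    """Busy CPU percentage between two cumulative cpu_times-style snapshots."""
    total_delta = _total_time(curr) - _total_time(prev)
    if total_delta <= 0:
        # No time elapsed (or counters went backwards): report zero.
        return 0.0
    idle_delta = curr.idle - prev.idle
    idle_delta += getattr(curr, "iowait", 0.0) - getattr(prev, "iowait", 0.0)
    busy_delta = max(0.0, total_delta - idle_delta)
    return round(100.0 * busy_delta / total_delta, 1)


# Stand-in for psutil.cpu_times() output, used only to keep the sketch runnable.
FakeCpuTimes = namedtuple("FakeCpuTimes", ["user", "system", "idle", "iowait"])
prev_snapshot = FakeCpuTimes(user=10.0, system=5.0, idle=80.0, iowait=5.0)
curr_snapshot = FakeCpuTimes(user=13.0, system=6.0, idle=85.0, iowait=6.0)
print(busy_percent(prev_snapshot, curr_snapshot))  # prints 40.0
```

In a real loop, prev and curr would be two results of psutil.cpu_times() taken one poll interval apart. The very first cycle has no previous snapshot, so it can only store one.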

Key design points

No thread offloading (to_thread) for this workload
First cycle acts as a warm‑up by design (it only records a baseline snapshot, since there is no earlier one to diff against)
Supports basic and detailed output modes
Optional per‑core output
Uses MonotonicTicker to keep a fixed cadence without drift
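The log does not show how MonotonicTicker works internally, so the following is only one plausible shape for it. The idea is to schedule each tick from the previous deadline rather than from the current time, so the time spent collecting does not push later ticks back. The constructor arguments, the wait method, and the skip-missed-ticks policy are all assumptions.

```python
import time


class MonotonicTicker:
    """Hypothetical fixed-cadence ticker based on a monotonic clock."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        if interval <= 0:
            raise ValueError("interval must be positive")
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self.next_deadline = clock() + interval

    def wait(self):
        """Sleep until the next deadline and return the seconds slept."""
        delay = self.next_deadline - self._clock()
        if delay > 0:
            self._sleep(delay)
        # Advance from the previous deadline, not from "now", to avoid drift.
        self.next_deadline += self.interval
        now = self._clock()
        if self.next_deadline <= now:
            # The loop overran one or more ticks: skip them and realign.
            missed = int((now - self.next_deadline) // self.interval) + 1
            self.next_deadline += missed * self.interval
        return max(delay, 0.0)


class FakeClock:
    """Deterministic clock used to exercise the ticker without real sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds
```

A polling loop would call the collectors and then ticker.wait() once per cycle. Injecting the clock and sleep functions is a design choice made here to keep the sketch testable; time.monotonic is used by default because it is unaffected by wall-clock adjustments.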

Implementation highlights

# src/collectors/cpu.py (excerpt)
cpu_times_snapshot = psutil.cpu_times()
# delta calculation performed on subsequent snapshots

Memory Collection

Memory collection is intentionally lightweight:

One call each to psutil.virtual_memory() and psutil.swap_memory()
Basic mode returns a compact set of key fields
Detailed mode returns the full psutil fields
Raw byte values are preserved (server‑side compute handles transformations)

# src/collectors/memory.py (excerpt)
mem = psutil.virtual_memory()
swap = psutil.swap_memory()
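The basic and detailed modes might be shaped along these lines. This is a sketch under assumptions: the function name build_memory_payload, the payload keys, and the choice of "basic" fields are hypothetical, since the log does not list them. The field names themselves (total, available, used, percent, and so on) are real psutil fields, and the byte values are passed through untouched.

```python
from collections import namedtuple

# Hypothetical selection of "key fields" for basic mode.
BASIC_VM_FIELDS = ("total", "available", "used", "percent")
BASIC_SWAP_FIELDS = ("total", "used", "percent")


def build_memory_payload(vm, swap, detailed=False):
    """Build a payload from virtual_memory()/swap_memory()-style named tuples."""
    vm_all = vm._asdict()
    swap_all = swap._asdict()
    if detailed:
        # Detailed mode: every field the platform reports, unmodified.
        return {"virtual": dict(vm_all), "swap": dict(swap_all)}
    # Basic mode: compact subset; fields missing on a platform are skipped.
    return {
        "virtual": {k: vm_all[k] for k in BASIC_VM_FIELDS if k in vm_all},
        "swap": {k: swap_all[k] for k in BASIC_SWAP_FIELDS if k in swap_all},
    }


# Stand-ins for psutil results so the sketch runs without psutil installed.
FakeVM = namedtuple("FakeVM", ["total", "available", "percent", "used", "free"])
FakeSwap = namedtuple("FakeSwap", ["total", "used", "free", "percent", "sin", "sout"])
sample_vm = FakeVM(total=1000, available=400, percent=60.0, used=600, free=400)
sample_swap = FakeSwap(total=200, used=50, free=150, percent=25.0, sin=0, sout=0)

basic_payload = build_memory_payload(sample_vm, sample_swap)
detailed_payload = build_memory_payload(sample_vm, sample_swap, detailed=True)
```

With psutil available, the same function would be called as build_memory_payload(psutil.virtual_memory(), psutil.swap_memory()), keeping to one call each per cycle.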

Disk Collection

The disk collector reports cumulative I/O counters rather than rates, because rates and other derived values are computed server‑side.

Uses psutil.disk_io_counters(perdisk=True)
Returns aggregate and per‑disk counters
Filters to physical devices only; excludes partitions from the per‑disk payload
Added a device‑name cache with periodic refresh to reduce repeated filtering overhead

# src/collectors/disk.py (excerpt)
disk_io = psutil.disk_io_counters(perdisk=True)
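The filtering and caching described above could look something like this. Everything here is an assumption made for illustration: the log does not say how physical devices are detected or how often the cache refreshes. The sketch treats a device as physical when /sys/block/<name>/device exists (a common Linux heuristic that excludes loop and ram devices) and refreshes the cached name set every N calls.

```python
import os


def list_physical_devices(sys_block="/sys/block"):
    """Return block devices that are backed by a device node in sysfs."""
    try:
        names = os.listdir(sys_block)
    except OSError:
        # Not Linux, or sysfs unavailable: report no known devices.
        return frozenset()
    return frozenset(
        name for name in names
        if os.path.exists(os.path.join(sys_block, name, "device"))
    )


class DeviceNameCache:
    """Hypothetical cache that re-scans device names every `refresh_every` calls."""

    def __init__(self, lister=list_physical_devices, refresh_every=60):
        self._lister = lister
        self._refresh_every = refresh_every
        self._calls_since_refresh = 0
        self._names = None

    def names(self):
        stale = self._calls_since_refresh >= self._refresh_every
        if self._names is None or stale:
            self._names = frozenset(self._lister())
            self._calls_since_refresh = 0
        self._calls_since_refresh += 1
        return self._names


def filter_perdisk(perdisk, cache):
    """Keep only per-disk counters whose device name is in the cached set."""
    allowed = cache.names()
    return {name: counters for name, counters in perdisk.items() if name in allowed}
```

Here perdisk stands for the dictionary returned by psutil.disk_io_counters(perdisk=True). Partitions such as sda1 do not appear as top-level entries under /sys/block, so they drop out of the per-disk payload under this heuristic. The periodic refresh lets hot-plugged devices appear without re-scanning on every cycle.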

Profiling Summary

Ran a 120‑second profiling session and examined both process stats and cProfile output.

Key findings

Agent CPU cost is very low (near‑idle for this polling interval)
Max RSS ≈ 15 MB
Runtime is dominated by intentional sleep (expected)
Collector costs are small; disk collection is the heaviest of the three
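Measurements of this kind can be gathered with the standard library alone. The sketch below is not the profiling harness used for the session; the helper name profile_call and the busy_work function are hypothetical. It combines cProfile with resource.getrusage, whose ru_maxrss field is reported in kibibytes on Linux.

```python
import cProfile
import io
import pstats
import resource


def profile_call(fn, *args, top=10, **kwargs):
    """Run fn under cProfile; return (result, report text, max RSS)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Cumulative time shows where the loop spends its wall-clock budget.
    stats.sort_stats("cumulative").print_stats(top)
    # Peak resident set size of this process (KiB on Linux).
    max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return result, stream.getvalue(), max_rss


def busy_work(n):
    """Small stand-in workload so the sketch has something to profile."""
    return sum(i * i for i in range(n))


result, report, max_rss = profile_call(busy_work, 10)
```

For an agent whose loop mostly sleeps, the cumulative-time view would be expected to list the sleep call at the top, which matches the finding that runtime is dominated by intentional sleep.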

Optimizations applied

Cached the physical‑device list to avoid filtering every cycle
Kept the output shape unchanged (disk_io + disk_io_perdisk)

The agent now has a clean baseline telemetry pipeline with low overhead and clear extension points for transport/shipping.

Next Planned Work

Add payload shipping to a backend endpoint
Implement bounded retry/backoff logic
Write collector‑focused tests

Project Overview

heka‑insights‑agent – a lightweight agent for collecting essential Linux system telemetry and shipping it to a configurable backend.
