Observability Practices: Implementing Real-World Monitoring With Python and Prometheus

Published: December 3, 2025 at 09:19 PM EST
Source: Dev.to

Introduction

Modern applications need to do more than run; they need to be understood. When something goes wrong in production, teams must be able to detect the issue, diagnose its root cause, and watch the system’s behavior in real time. This is where observability becomes essential.

What Is Observability?

Observability is the ability to understand the internal state of a system based on the data it produces. It is built around three core pillars:

1. Metrics

Numeric values, sampled over time, that reflect system state.
Examples: request latency, CPU usage, memory consumption.

2. Logs

Detailed event records generated by applications and systems.
Examples: authentication messages, errors, warnings.

3. Traces

End‑to‑end tracking of requests across services.
Useful in microservices and distributed systems.

Together, these help answer:

  • What is happening?
  • Why is it happening?
  • Where is it failing?

Why Observability Matters

  • Detect issues earlier
  • Reduce downtime
  • Improve performance
  • Understand user impact
  • Monitor applications at scale
  • Make data‑driven decisions

Without observability, debugging becomes slow, reactive, and inconsistent.

Real-World Example: Observability With Python + Prometheus

1. Install Dependencies

pip install fastapi uvicorn prometheus-client

2. Python API With Prometheus Metrics

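import random
import time

from fastapi import FastAPI
from fastapi.responses import Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest

app = FastAPI()

# Counter: a value that only ever increases (it restarts at zero with the process).
REQUEST_COUNT = Counter("api_requests_total", "Total number of API requests received")
# Histogram: observations counted into buckets, plus a running sum and count.
REQUEST_LATENCY = Histogram("api_request_latency_seconds", "API request latency")


@app.get("/")
def home():
    REQUEST_COUNT.inc()
    # Time the handler body, including the simulated work below.
    with REQUEST_LATENCY.time():
        time.sleep(random.uniform(0.1, 0.5))  # simulate 100-500 ms of work
        return {"message": "API is running successfully"}


@app.get("/metrics")
def metrics():
    # Serve all registered metrics in the Prometheus text exposition format.
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)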

Metrics exposed

Metric                        Description
api_requests_total            Counts requests handled by the / endpoint
api_request_latency_seconds   Measures request duration (seconds)
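
To see what these two metrics look like on the wire, here is a minimal sketch that records a single request and prints the text a /metrics endpoint would return. It registers the metrics on a separate CollectorRegistry, which is an assumption made for this demo so that it does not touch the default registry used by the API code.

```python
from prometheus_client import CollectorRegistry, Counter, Histogram, generate_latest

# A private registry keeps this demo isolated from the app's default registry.
registry = CollectorRegistry()

requests_total = Counter(
    "api_requests_total",
    "Total number of API requests received",
    registry=registry,
)
latency = Histogram(
    "api_request_latency_seconds",
    "API request latency",
    registry=registry,
)

# Simulate one request that took 0.25 seconds.
requests_total.inc()
latency.observe(0.25)

# generate_latest returns bytes in the Prometheus text exposition format.
exposition = generate_latest(registry).decode("utf-8")
print(exposition)
```

In the printed text, the histogram appears as several series: one api_request_latency_seconds_bucket line per bucket boundary, plus api_request_latency_seconds_sum and api_request_latency_seconds_count.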

3. Prometheus Configuration

Create prometheus.yml:

global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "python-api"
    static_configs:
      - targets: ["localhost:8000"]

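Prometheus scrapes each target’s /metrics path by default, so with this configuration it polls http://localhost:8000/metrics every 5 seconds. This assumes the API is already listening on port 8000, which is uvicorn’s default port (for example, started with uvicorn main:app if the code above is saved as main.py, a filename assumed here).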
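
Before pointing Prometheus at the target, it can help to check the endpoint by hand. The sketch below uses only the standard library; the function names parse_samples and fetch_samples are hypothetical helpers, and the parser is deliberately simplified (it assumes no timestamps follow the value, which matches what prometheus-client emits by default).

```python
from urllib.request import urlopen


def parse_samples(text: str) -> dict[str, float]:
    """Parse exposition text into {series: value}, skipping comment lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and "# HELP" / "# TYPE" metadata
        # The sample value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_samples(url: str = "http://localhost:8000/metrics") -> dict[str, float]:
    """Fetch the metrics endpoint once, roughly what one scrape retrieves."""
    with urlopen(url, timeout=5) as response:
        return parse_samples(response.read().decode("utf-8"))
```

With the API running, calling fetch_samples() should return a dictionary that includes an api_requests_total entry; if the call fails to connect, Prometheus will not be able to scrape the target either.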

4. Run Prometheus

./prometheus --config.file=prometheus.yml

Open the Prometheus UI (http://localhost:9090 by default) and query metrics such as:

  • api_requests_total
  • rate(api_requests_total[1m])
  • api_request_latency_seconds_bucket
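
The rate() query above reports the average per-second increase of the counter over the chosen window. As a rough illustration of the idea, and not Prometheus's actual implementation (which also extrapolates to the window edges), the hypothetical simple_rate function below computes the same kind of figure from a list of (timestamp, value) samples and treats any drop in value as a counter reset.

```python
def simple_rate(samples: list[tuple[float, float]]) -> float:
    """Average per-second increase of a counter over (timestamp, value) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")

    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        # A drop means the counter restarted from zero (e.g. a process restart),
        # so the whole new value counts as increase.
        increase += value if value < previous else value - previous
        previous = value
    return increase / elapsed


# Four scrapes, 5 seconds apart: 30 requests in 15 seconds.
steady = [(0.0, 100.0), (5.0, 110.0), (10.0, 120.0), (15.0, 130.0)]
print(simple_rate(steady))
```

With the 5-second scrape interval configured earlier, a 1-minute window holds roughly 12 samples, which is enough for a stable rate.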

5. Optional: Grafana Dashboard

Grafana can visualize your Prometheus metrics. Typical graphs include:

  • Request rate
  • CPU and memory usage
  • Error percentage
  • Latency percentiles (p95, p99)
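
Latency percentiles are not stored directly; they are estimated from the histogram's cumulative buckets, typically with a PromQL expression such as histogram_quantile(0.95, rate(api_request_latency_seconds_bucket[5m])). The sketch below shows the underlying idea under simplifying assumptions: it finds the bucket that contains the requested rank and interpolates linearly inside it. The function name estimate_quantile and the sample bucket counts are made up for illustration.

```python
import math


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound, and the last bound must be +Inf.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet

    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            in_bucket = count - lower_count
            if math.isinf(upper_bound) or in_bucket == 0:
                # Nothing to interpolate against: fall back to the lower edge.
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Illustrative cumulative counts for 100 requests (not measured data).
example_buckets = [(0.1, 10), (0.25, 60), (0.5, 95), (1.0, 100), (math.inf, 100)]
p50 = estimate_quantile(0.50, example_buckets)
p99 = estimate_quantile(0.99, example_buckets)
print(p50, p99)
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies you care about.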

Observability Best Practices

  • ✔ Instrument every major endpoint – expose metrics for performance‑critical APIs.
  • ✔ Standardize metric names – avoid random or unstructured naming.
  • ✔ Include labels (tags) – e.g., status_code, endpoint, method for richer context.
  • ✔ Use alerts – e.g., “95th percentile latency exceeds 500 ms for 3 minutes.”
  • ✔ Visualize everything – dashboards make patterns obvious.
  • ✔ Combine logs, metrics, and traces – observability works best when all three pillars are present.
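
As a sketch of the labeling practice above, the example below declares a counter and a histogram with labels and records a few requests. The metric names, the record_request helper, and the separate registry are assumptions chosen for illustration, not part of the original API code.

```python
from prometheus_client import CollectorRegistry, Counter, Histogram

# A private registry keeps this demo isolated from the default registry.
registry = CollectorRegistry()

HTTP_REQUESTS = Counter(
    "http_requests_total",
    "HTTP requests by method, endpoint, and status code",
    ["method", "endpoint", "status_code"],
    registry=registry,
)
HTTP_LATENCY = Histogram(
    "http_request_duration_seconds",
    "HTTP request duration in seconds",
    ["method", "endpoint"],
    registry=registry,
)


def record_request(method: str, endpoint: str, status_code: int, duration_seconds: float) -> None:
    """Record one finished request under its label combination."""
    HTTP_REQUESTS.labels(
        method=method, endpoint=endpoint, status_code=str(status_code)
    ).inc()
    HTTP_LATENCY.labels(method=method, endpoint=endpoint).observe(duration_seconds)


record_request("GET", "/", 200, 0.12)
record_request("GET", "/", 200, 0.31)
record_request("GET", "/", 500, 0.05)

ok_count = registry.get_sample_value(
    "http_requests_total",
    {"method": "GET", "endpoint": "/", "status_code": "200"},
)
print(ok_count)
```

Each distinct label combination becomes its own time series, so label values are best kept to a small, fixed set (route templates rather than raw URLs, for instance).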

Conclusion

Observability allows teams to deeply understand how their systems behave. Using Prometheus + FastAPI, you can expose useful metrics that support:

  • Faster debugging
  • Better performance insights
  • Safer deployments
  • Scalable system monitoring

The example can be expanded with tracing (OpenTelemetry), log pipelines (ELK Stack), or full‑stack observability platforms such as AWS CloudWatch, Datadog, or Azure Monitor.
