Observability Practices: Implementing Real-World Monitoring With Python and Prometheus

Published: December 3, 2025 at 09:19 PM EST
2 min read
Source: Dev.to

What Is Observability?

Observability is the ability to understand the internal state of a system based on the data it produces. It is built around three core pillars:

1. Metrics

Numeric values that reflect system state.
Examples: request latency, CPU usage, memory consumption.

2. Logs

Detailed event records generated by applications and systems.
Examples: authentication messages, errors, warnings.

3. Traces

End‑to‑end tracking of requests across services.
Useful in microservices and distributed systems.

Together, these help answer:

  • What is happening?
  • Why is it happening?
  • Where is it failing?
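A minimal, stdlib-only sketch of what each pillar looks like at the code level (all names here are illustrative; the Prometheus client shown later handles metrics properly):

```python
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("checkout")

request_count = 0  # metric: a numeric value reflecting system state

def handle_request() -> float:
    global request_count
    trace_id = uuid.uuid4().hex  # trace: an ID correlating events across services
    start = time.perf_counter()
    request_count += 1
    logger.info("order placed trace_id=%s", trace_id)  # log: a detailed event record
    return time.perf_counter() - start  # metric: a latency sample

latency = handle_request()
```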

Why Observability Matters

  • Detect issues earlier
  • Reduce downtime
  • Improve performance
  • Understand user impact
  • Monitor applications at scale
  • Make data‑driven decisions

Without observability, debugging becomes slow, reactive, and inconsistent.

Real-World Example: Observability With Python + Prometheus

Install Dependencies

pip install fastapi uvicorn prometheus-client

Python API With Prometheus Metrics

from fastapi import FastAPI
from fastapi.responses import Response
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    Counter,
    Histogram,
    generate_latest,
)
import random
import time

app = FastAPI()

# Counter: monotonically increasing total of handled requests.
REQUEST_COUNT = Counter("api_requests_total", "Total number of API requests received")
# Histogram: distribution of request durations, in seconds.
REQUEST_LATENCY = Histogram("api_request_latency_seconds", "API request latency")

@app.get("/")
def home():
    REQUEST_COUNT.inc()
    with REQUEST_LATENCY.time():
        # Simulate variable processing time.
        time.sleep(random.uniform(0.1, 0.5))
        return {"message": "API is running successfully"}

@app.get("/metrics")
def metrics():
    # Expose metrics in the Prometheus text exposition format.
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
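Assuming the code above is saved as main.py, the API can be started on port 8000 (the port the Prometheus configuration below expects) with:

```shell
uvicorn main:app --port 8000
```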

Metrics Exposed

  Metric                         Description
  api_requests_total             Counts all incoming requests
  api_request_latency_seconds    Measures request duration (seconds)
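Histograms are exposed as cumulative bucket counters (the _bucket series queried later). A stdlib-only sketch of how observations fall into buckets, with illustrative boundaries and sample values:

```python
import math

# Cumulative histogram buckets: each "le" boundary counts observations <= it.
# Boundaries here are illustrative, not the Prometheus client defaults.
boundaries = [0.1, 0.25, 0.5, 1.0, math.inf]
counts = {le: 0 for le in boundaries}

observations = [0.05, 0.12, 0.3, 0.45, 0.8]  # hypothetical latencies in seconds
for obs in observations:
    for le in boundaries:
        if obs <= le:
            counts[le] += 1  # cumulative: an observation lands in every bucket above it

print(counts)  # {0.1: 1, 0.25: 2, 0.5: 4, 1.0: 5, inf: 5}
```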

Prometheus Configuration

Create prometheus.yml:

global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "python-api"
    static_configs:
      - targets: ["localhost:8000"]

Prometheus will scrape the metrics endpoint at http://localhost:8000/metrics.

Run Prometheus

./prometheus --config.file=prometheus.yml

Open the Prometheus UI at http://localhost:9090 and query metrics such as:

  • api_requests_total
  • rate(api_requests_total[1m])
  • api_request_latency_seconds_bucket
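As a rough intuition for rate(api_requests_total[1m]): ignoring counter resets, rate() is approximately the per-second difference quotient between samples at the edges of the window. A sketch with hypothetical sample values:

```python
# Hypothetical counter samples (timestamp_seconds, value) taken 60 s apart.
t1, v1 = 0.0, 1200.0
t2, v2 = 60.0, 1500.0

# Approximate per-second rate over the window (the real rate() also
# handles counter resets and range extrapolation).
rate_per_second = (v2 - v1) / (t2 - t1)
print(rate_per_second)  # 5.0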

Optional: Grafana Dashboard

Grafana can visualize your Prometheus metrics. Typical graphs include:

  • Request rate
  • CPU and memory usage
  • Error percentage
  • Latency percentiles (p95, p99)

Observability Best Practices

  • ✔ Instrument every major endpoint – expose metrics for performance‑critical APIs.
  • ✔ Standardize metric names – avoid random or unstructured naming.
  • ✔ Include labels (tags) – e.g., status_code, endpoint, method for richer context.
  • ✔ Use alerts – e.g., “95th percentile latency exceeds 500 ms for 3 minutes.”
  • ✔ Visualize everything – dashboards make patterns obvious.
  • ✔ Combine logs, metrics, and traces – observability works best when all three pillars are present.
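The latency alert from the list above could be written as a Prometheus alerting rule, for example (group, alert, and label values here are illustrative):

```yaml
groups:
  - name: api-latency
    rules:
      - alert: HighRequestLatency
        expr: histogram_quantile(0.95, rate(api_request_latency_seconds_bucket[5m])) > 0.5
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "p95 latency above 500 ms for 3 minutes"
```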

Conclusion

Observability allows teams to deeply understand how their systems behave. Using Prometheus + FastAPI, you can expose useful metrics that support:

  • Faster debugging
  • Better performance insights
  • Safer deployments
  • Scalable system monitoring

The example can be expanded with tracing (OpenTelemetry), log pipelines (ELK Stack), or full‑stack observability platforms such as AWS CloudWatch, Datadog, or Azure Monitor.

References

  • Prometheus Documentation
  • Grafana Documentation
  • FastAPI
  • OpenTelemetry