Observability Practices: Implementing Real-World Monitoring With Python and Prometheus
Source: Dev.to
What Is Observability?
Observability is the ability to understand the internal state of a system based on the data it produces. It is built around three core pillars:
1. Metrics
Numeric values that reflect system state.
Examples: request latency, CPU usage, memory consumption.
2. Logs
Detailed event records generated by applications and systems.
Examples: authentication messages, errors, warnings.
3. Traces
End‑to‑end tracking of requests across services.
Useful in microservices and distributed systems.
Together, these help answer:
- What is happening?
- Why is it happening?
- Where is it failing?
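The three pillars can be illustrated with a minimal, dependency-free sketch. The names here (`metrics` dict, `span` helper, `handle_request`) are illustrative inventions, not a real observability library:

```python
import logging
import time
import uuid
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")

# Pillar 1: metrics -- numeric values reflecting system state.
metrics = {"requests_total": 0, "latency_seconds": []}

# Pillar 3: traces -- a trace ID ties together all work done for one request.
@contextmanager
def span(trace_id, name):
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        metrics["latency_seconds"].append(elapsed)
        # Pillar 2: logs -- detailed event records, correlated via the trace ID.
        log.info("trace=%s span=%s took=%.4fs", trace_id, name, elapsed)

def handle_request():
    metrics["requests_total"] += 1
    trace_id = uuid.uuid4().hex[:8]
    with span(trace_id, "handle_request"):
        time.sleep(0.01)  # simulated work

handle_request()
print(metrics["requests_total"])  # 1
```

Real systems replace the dict with Prometheus, the logger with a log pipeline, and the trace ID with a tracing SDK, but the division of labor is the same.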
Why Observability Matters
- Detect issues earlier
- Reduce downtime
- Improve performance
- Understand user impact
- Monitor applications at scale
- Make data‑driven decisions
Without observability, debugging becomes slow, reactive, and inconsistent.
Real-World Example: Observability With Python + Prometheus
Install Dependencies
```shell
pip install fastapi uvicorn prometheus-client
```
Python API With Prometheus Metrics
```python
from fastapi import FastAPI
from fastapi.responses import Response
from prometheus_client import Counter, Histogram, generate_latest
import time
import random

app = FastAPI()

REQUEST_COUNT = Counter("api_requests_total", "Total number of API requests received")
REQUEST_LATENCY = Histogram("api_request_latency_seconds", "API request latency")

@app.get("/")
def home():
    REQUEST_COUNT.inc()
    with REQUEST_LATENCY.time():
        time.sleep(random.uniform(0.1, 0.5))
    return {"message": "API is running successfully"}

@app.get("/metrics")
def metrics():
    return Response(generate_latest(), media_type="text/plain")
```
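To expose these metrics, start the app with uvicorn on the port the Prometheus config below targets (the filename main.py is an assumption):

```shell
uvicorn main:app --port 8000
```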
Metrics exposed
| Metric | Description |
|---|---|
| api_requests_total | Counts all incoming requests |
| api_request_latency_seconds | Measures request duration (seconds) |
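The /metrics endpoint serves these metrics in Prometheus' plain-text exposition format. A rough stdlib sketch of what that output looks like and how it could be read back (the sample values are made up):

```python
# Sample of the plain-text exposition format served at /metrics
# (values are illustrative).
sample = """\
# HELP api_requests_total Total number of API requests received
# TYPE api_requests_total counter
api_requests_total 42.0
# HELP api_request_latency_seconds API request latency
# TYPE api_request_latency_seconds histogram
api_request_latency_seconds_count 42.0
api_request_latency_seconds_sum 12.6
"""

def parse_metrics(text):
    """Parse simple, label-free samples into a name -> float dict."""
    values = {}
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        name, value = line.rsplit(" ", 1)
        values[name] = float(value)
    return values

parsed = parse_metrics(sample)
print(parsed["api_requests_total"])  # 42.0
```

In practice Prometheus itself does this parsing; the sketch only shows the shape of the data being scraped.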
Prometheus Configuration
Create prometheus.yml:
```yaml
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "python-api"
    static_configs:
      - targets: ["localhost:8000"]
```
Prometheus will scrape the metrics endpoint at http://localhost:8000/metrics.
Run Prometheus
```shell
./prometheus --config.file=prometheus.yml
```
Open the Prometheus UI at http://localhost:9090 and query metrics such as:
- api_requests_total
- rate(api_requests_total[1m])
- api_request_latency_seconds_bucket
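rate() estimates the per-second increase of a counter over the window. For two samples it reduces to a difference quotient, as in this sketch (real rate() also handles counter resets and extrapolation):

```python
def counter_rate(sample_old, sample_new):
    """Approximate PromQL rate() from two (unix_seconds, counter_value) samples.

    Covers only the monotonically increasing case; the real function
    also detects counter resets and extrapolates to the window edges.
    """
    (t0, v0), (t1, v1) = sample_old, sample_new
    return (v1 - v0) / (t1 - t0)

# Counter went from 100 to 160 requests over 60 seconds:
print(counter_rate((0, 100), (60, 160)))  # 1.0 requests/second
```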
Optional: Grafana Dashboard
Grafana can visualize your Prometheus metrics. Typical graphs include:
- Request rate
- CPU and memory usage
- Error percentage
- Latency percentiles (p95, p99)
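Latency percentiles such as p95 and p99 summarize tail behavior: p95 is the value at or below which 95% of requests complete. A minimal sketch of computing them from raw samples (nearest-rank method; Grafana derives them from histogram buckets instead):

```python
def percentile(samples, pct):
    """Nearest-rank percentile: value at or below which pct% of samples fall."""
    ordered = sorted(samples)
    # Nearest-rank: ceil(pct/100 * n), done with integer ceiling division.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[int(rank) - 1]

latencies_ms = [12, 15, 14, 13, 200, 16, 14, 15, 13, 450]
print(percentile(latencies_ms, 95))  # 450
print(percentile(latencies_ms, 50))  # 14
```

The gap between p50 and p95 here is exactly what a dashboard makes visible: the median looks healthy while the tail is not.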
Observability Best Practices
- ✔ Instrument every major endpoint – expose metrics for performance‑critical APIs.
- ✔ Standardize metric names – avoid random or unstructured naming.
- ✔ Include labels (tags) – e.g., status_code, endpoint, method for richer context.
- ✔ Use alerts – e.g., “95th percentile latency exceeds 500 ms for 3 minutes.”
- ✔ Visualize everything – dashboards make patterns obvious.
- ✔ Combine logs, metrics, and traces – observability works best when all three pillars are present.
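An alert like “95th percentile latency exceeds 500 ms for 3 minutes” fires only when the condition holds for the whole window, which avoids paging on a single spike. A toy evaluator of that “for” semantics (the function name and data shape are illustrative, mirroring the for: clause in Prometheus alerting rules):

```python
def should_fire(samples, threshold, hold_for):
    """Fire only if every sample in the trailing hold_for window breaches threshold.

    samples: list of (timestamp_seconds, value) pairs, oldest first.
    """
    if not samples:
        return False
    window_start = samples[-1][0] - hold_for
    window = [v for t, v in samples if t >= window_start]
    return all(v > threshold for v in window)

# p95 latency (ms) sampled every 60 s; the breach must persist for 180 s:
sustained = [(0, 120), (60, 510), (120, 530), (180, 525), (240, 560)]
spike = [(0, 120), (60, 510), (120, 200), (180, 525), (240, 560)]
print(should_fire(sustained, threshold=500, hold_for=180))  # True
print(should_fire(spike, threshold=500, hold_for=180))      # False
```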
Conclusion
Observability allows teams to deeply understand how their systems behave. Using Prometheus + FastAPI, you can expose useful metrics that support:
- Faster debugging
- Better performance insights
- Safer deployments
- Scalable system monitoring
The example can be expanded with tracing (OpenTelemetry), log pipelines (ELK Stack), or full‑stack observability platforms such as AWS CloudWatch, Datadog, or Azure Monitor.
References
- Prometheus Documentation
- Grafana Documentation
- FastAPI
- OpenTelemetry