Wired Django, Nextcloud, Grafana, Loki & Prometheus into a secure observability mesh over Tailnet (metrics & logs, dashboards).
Source: Dev.to
Goal
Architecture
Stack Overview
- Prometheus → scrapes metrics from Django and Nextcloud API endpoints
- Loki → ingests logs from both services
- Grafana → visualizes metrics and logs together
- Caddy → reverse proxy with trusted TLS for all endpoints
- Tailnet (Tailscale) → private network with identity‑based access
Everything talks securely over the Tailnet: no ports exposed to the public internet, no unencrypted traffic.
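To make the "metrics and logs together" part concrete, here is a minimal sketch of a Grafana data source provisioning file that registers both Prometheus and Loki. The URLs and ports are assumptions (the default Prometheus and Loki ports, with both running on the same host as Grafana); the original setup may well use Tailnet hostnames instead.

```yaml
# Hypothetical file: provisioning/datasources/observability.yaml
apiVersion: 1

datasources:
  # Metrics backend. URL is an assumption (default Prometheus port).
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true

  # Logs backend. URL is an assumption (default Loki port).
  - name: Loki
    type: loki
    access: proxy
    url: http://localhost:3100
```

With both data sources registered, a single dashboard can mix metric panels and log panels.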
Challenges
- Grafana showed logs but no metrics
- TLS verification issues in Prometheus
- Cross‑service routing
Config Highlights
Prometheus scrape configuration (YAML)
scrape_configs:
  - job_name: "django"
    metrics_path: /metrics
    static_configs:
      - targets: ["X.tail.ts.net:8000"]
  - job_name: "nextcloud"
    metrics_path: /metrics
    static_configs:
      - targets: ["X.tail.ts.net:8080"]
Both routes sit behind Caddy, which handles TLS termination using trusted Tailnet certificates.
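One detail worth checking when targets sit behind a TLS-terminating proxy: Prometheus scrapes over plain HTTP unless a job sets `scheme: https`. The sketch below shows how one job might look if Prometheus scrapes through Caddy. The port 443 and the `server_name` value are assumptions, not taken from the original setup, and `X.tail.ts.net` remains the placeholder used above.

```yaml
scrape_configs:
  - job_name: "django"
    scheme: https              # Prometheus defaults to http when this is omitted
    metrics_path: /metrics
    static_configs:
      # Assumption: Caddy serves HTTPS for this host on port 443.
      - targets: ["X.tail.ts.net:443"]
    tls_config:
      # Only needed when the target address differs from the hostname
      # on the certificate; shown here for illustration.
      server_name: "X.tail.ts.net"
```

If the certificate chains to a CA already in the system trust store, no `ca_file` should be required. Setting `insecure_skip_verify: true` would silence verification errors but defeats the purpose of trusted TLS.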
Results
- Correlate logs and metrics per request
- Track uptime and performance trends
- Visualize distributed system behavior across all nodes
It feels like operating my own mini control plane — distributed, secure, and explainable.
Next Steps
- Add distributed tracing (OpenTelemetry)
- Define Prometheus alert rules for critical endpoints
- Automate observability config rollout via CI/CD
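As a starting point for the alert rules item, here is a minimal sketch of a Prometheus rule file that fires when either scrape job stops responding. The group name, alert name, 5-minute window, and severity label are illustrative assumptions; only the job names come from the scrape configuration above.

```yaml
# Hypothetical rule file, e.g. alerts.yml
groups:
  - name: endpoint-availability
    rules:
      - alert: TargetDown
        # `up` is 0 when the most recent scrape of a target failed.
        expr: up{job=~"django|nextcloud"} == 0
        for: 5m                  # assumption: tolerate brief blips
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.job }} target {{ $labels.instance }} is down"
```

A file like this would be listed under `rule_files` in the main Prometheus configuration. Delivering notifications additionally requires an Alertmanager, which this post does not cover.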
Key Takeaway
Observability isn’t an add‑on — it’s the nervous system of your infrastructure.
When your servers start talking, you start listening differently.