What do you actually check in the first 15 minutes after deploy?
Source: Dev.to
Background
CI passed.
The deploy finished.
Nothing is obviously broken.
And yet, for a few minutes after release, production still feels uncertain. I think this is one of the most awkward parts of shipping software.
A deployment can be technically successful:
- build passes
- tests pass
- pipeline passes
- container starts
- health checks look fine
But real runtime problems can still surface only after actual traffic hits the system. That creates a weird gap between deploy success and runtime confidence.
In many smaller teams, the first few minutes after a deploy look something like this:
- Open logs
- Check recent exceptions
- Watch for error spikes
- Compare current noise with what “normal” felt like before
- Decide whether to ignore, investigate, or roll back
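The "ignore, investigate, or roll back" decision in that checklist can be sketched as a tiny triage function. This is a hypothetical illustration, not Relivio's implementation: the log signatures, thresholds, and decision labels are all assumptions.

```python
from collections import Counter

def triage(baseline_errors, recent_errors, spike_factor=3.0, min_count=5):
    """Compare post-deploy error signatures to a pre-deploy baseline window
    and return 'ignore', 'investigate', or 'rollback'."""
    base = Counter(baseline_errors)
    now = Counter(recent_errors)

    # Signatures that never appeared before the deploy
    new_sigs = [sig for sig in now if sig not in base]
    # Existing signatures whose volume spiked well past the baseline
    worsened = [
        sig for sig in now
        if sig in base and now[sig] >= max(min_count, spike_factor * base[sig])
    ]

    if new_sigs and sum(now[s] for s in new_sigs) >= min_count:
        return "rollback"     # brand-new failure mode at real volume
    if worsened or new_sigs:
        return "investigate"  # existing noise got worse, or low-volume new errors
    return "ignore"           # error profile matches the pre-deploy baseline

# Simulated exception signatures from before and after a deploy
before = ["TimeoutError:/checkout"] * 2 + ["ValueError:/search"]
after = ["TimeoutError:/checkout"] * 2 + ["KeyError:/checkout"] * 6

print(triage(before, after))  # → rollback
```

Real systems would pull these signatures from structured logs or an error tracker, and the thresholds would need tuning per service, but the shape of the judgment is the same: compare against baseline, then decide.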
We have plenty of tools for detection (exceptions, timeouts, retries, latency spikes, failed external API calls, degraded endpoints), but detection is not the same as judgment. The real post‑deploy question is usually:
Did this deploy actually make things worse?
And then:
Does it need attention right now?
That second layer still feels surprisingly manual.
If you have mature release control, canary rollouts, feature flags, and a strong observability setup, the uncertainty window is probably much smaller. But many teams do not have all of that, and even when they do, someone still has to interpret what production is actually saying after a release.
The key is not just “can we collect signals?” but:
- Which signals matter most right after deploy?
- How do you compare them against normal behavior?
- How do you tell noise from regression?
- What gives you enough confidence to say “this deploy is fine”?
- What makes you stop and investigate immediately?
When I think about the first 10–15 minutes after a deploy, I usually care less about giant dashboards and more about a small number of judgment signals:
- Did new runtime exceptions appear?
- Did existing exception patterns get worse?
- Are failures concentrated on one service or API path?
- Is the error pattern meaningfully different from recent baseline behavior?
- Does this look transient, or does it look deploy‑related?
That feels like a different problem from general monitoring; it’s closer to post‑deploy runtime diagnosis.
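One of the judgment signals above, "are failures concentrated on one service or API path?", is simple enough to sketch directly. The endpoint names and the 60% threshold here are illustrative assumptions, not part of any real tool:

```python
from collections import Counter

def concentration(error_paths):
    """Fraction of errors hitting the single worst path (0.0 to 1.0)."""
    if not error_paths:
        return 0.0
    counts = Counter(error_paths)
    return counts.most_common(1)[0][1] / len(error_paths)

errors = ["/checkout", "/checkout", "/checkout", "/search", "/checkout"]
ratio = concentration(errors)
print(f"worst-path share: {ratio:.0%}")  # → worst-path share: 80%
if ratio > 0.6:
    print("concentrated failure: likely deploy-related, inspect the change touching that path")
```

A spread-out error profile tends to point at infrastructure or a transient blip; a tight concentration on one path right after a deploy tends to point at the code you just shipped.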
My Approach
This line of thinking led me to start building Relivio. The idea is narrow: not a full observability platform, just a focused way to answer “Is this deploy safe, or does it need attention?”
- Minimal FastAPI demo:
- Main project site:
Discussion Questions
- What do you actually check in the first 10–15 minutes after a deploy?
- Do you rely mostly on logs, alerts, dashboards, release views, or something else?
- What signal makes you think “this deploy is probably bad”?
- What signal makes you confident enough to leave it alone?
- If you already have a strong workflow for this, what does it look like?
- If you do not, what part still feels manual or annoying?
I’m especially interested in answers from small teams and side projects, because that is where this still feels the most human and least automated. If you think this is already solved well by your current stack, I’d like to hear that too. And if you think the problem isn’t painful enough to deserve a dedicated tool, I’d genuinely like to know why.