Beating Tail Latency: A Guide to Request Hedging in Go Microservices

Published: 1 month ago (March 13, 2026 at 11:03 PM EDT)

3 min read

Source: Dev.to

In distributed systems, we often talk about “The Long Tail.”
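You might have a service where 95 % of requests finish in under 100 ms, but the slowest 1 % (everything beyond the P99 latency) can take 2 seconds or more. In a microservice architecture where one user action triggers ten different service calls, the slow cases add up: if each call independently has a 1 % chance of landing in the tail, roughly 10 % of user actions include at least one slow call, and that single slow dependency bottlenecks the entire user experience.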

Standard retries don’t help here because a “tail latency” request hasn’t failed yet; it’s just slow. Waiting for a 2‑second timeout to trigger a retry wastes time. To beat the long tail, you need Request Hedging (also known as speculative retries).

What is Request Hedging?

The concept is simple but powerful: if a request is taking longer than usual (e.g., longer than the P95 latency), don’t kill it. Instead, start a second, identical request in parallel. Whichever request finishes first, you take its result and cancel the other one.

This speculative approach reduces P99 latency because, as long as slowness is not correlated between attempts, the probability that two identical requests both land in the long tail is far lower than the probability that one does.

The Complexity of Manual Hedging

Implementing hedging manually in Go is a nightmare of goroutine management:

You need a select block with a timer.
You need to coordinate between two (or more) goroutines.
You must ensure that once one succeeds, the others are cancelled immediately to save resources.
You have to handle race conditions where both might succeed at the exact same millisecond.

Most developers end up with hundreds of lines of brittle boilerplate code to handle just one hedged call.

The Resile Way: `DoHedged`

Resile makes request hedging as simple as a single function call. It handles the goroutine lifecycle, context cancellation, and race conditions for you.

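import (
    "context"
    "time"

    "github.com/cinar/resile"
)

// Fetch a user with hedging enabled.
data, err := resile.DoHedged(
    ctx,
    func(ctx context.Context) (*User, error) {
        // Pass the attempt's ctx through so a losing attempt can be cancelled.
        return apiClient.GetUser(ctx, userID)
    },
    resile.WithMaxAttempts(3),
    // Wait 100 ms before starting a hedged attempt.
    resile.WithHedgingDelay(100*time.Millisecond),
)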

What happens under the hood?

Resile starts the first request.
It waits for the configured HedgingDelay (e.g., 100 ms).
If the first request hasn’t finished, it starts a second request.
As soon as one returns a successful result, Resile cancels the context of the other request and returns the data to you.

Picking the Right Hedging Delay

The “magic” of hedging lies in the delay.

Too short: you double your traffic unnecessarily, adding extra load to downstream services.
Too long: you don’t gain much latency benefit.

Pro‑Tip: set HedgingDelay to your P95 or P99 latency. Only the slowest 1‑5 % of requests then trigger a hedge, so the extra load on downstream services stays at roughly 1‑5 % while the slowest requests get a second chance to finish quickly.

Observability: Tracking the “Speculative” Wins

If you’re using Resile’s OpenTelemetry integration (telemetry/resileotel), you can see these wins in your distributed traces. Each hedged attempt is recorded as a sub‑span. When a hedged request wins, the first span is cancelled and the second succeeds, which shows directly that hedging saved a user from a 2‑second wait.

Conclusion

Request hedging used to be a technique reserved for companies with massive infrastructure teams. With Resile, it’s a tool that every Go developer can use to build snappier, more resilient microservices.

By moving from “Wait and Retry” to “Hedge and Win,” you can turn your long‑tail latency into a competitive advantage.

Give Resile a star on GitHub:
