Retrying HTTP Requests in Go Without Making It Worse

Published: (May 26, 2026 at 05:15 PM EDT)
Source: Dev.to


When you call an external API, things go fine until they don’t.

A network blip, a server restart, a rate limit. So you add a retry, and most of the time it helps. The problem is that the obvious retry—the one we all write first—can quietly make things worse: it can resend a payment twice, or keep a struggling server down longer than the original failure would have.

In this post we’ll build a retry client in Go from the naive version up, fix each thing that bites, and end somewhere a little uncomfortable: telling you to use a library instead. By the end you’ll understand exactly what that library is doing, which is the real reason to read this.

The retry everyone writes first

Three tries, wait a second, give up. It looks reasonable, and there are three problems hiding in it.

  1. The body. An HTTP client reads the request body to the end when it sends it, and it doesn’t rewind. So on the second attempt there’s nothing left to send, the server gets an empty POST, and it returns 400. Now you’re retrying a 400 that you caused.
  2. The wait is the same for everyone. When a server starts returning 503s, every client retries on the same one‑second tick, together, over and over. You haven’t given it room to recover. You’ve lined everyone up to hit it again at the same moment. This is the thundering herd.
  3. You’re retrying everything. A 400 means your request is wrong. A 404 means the thing isn’t there. No amount of retrying changes either, but this loop treats every failure the same.

Three problems, and they don’t get fixed in the same place. Let’s take them one at a time.

Backoff: don’t retry in lockstep

The wait between attempts should grow, and it shouldn’t be identical for every client. Growing is exponential backoff. Not‑identical is jitter. You want both.

Two details here are worth slowing down on, because they’re the ones people get wrong.

  • I used a left shift instead of math.Pow(2, attempt). Floats are imprecise for this, and Base * 2^attempt overflows an int64 of nanoseconds sooner than you’d expect, which can wrap a careful 800 ms wait into a negative number. Shifting integers and capping early avoids all of it.

  • Jitter is a wrapper, not a second copy of the exponential math. The tempting move is to write a separate ExponentialBackoffWithJitter type and duplicate the doubling inside it. Then your linear and constant strategies don’t get jitter, and you’ve got the same logic in two places. Wrapping any Backoff keeps it in one place.

    One more thing: this is full jitter, random(0, d), not “the wait give or take 25 %”. The difference is the whole point. If every client jitters in a narrow band around the same target, they stay bunched together, just a slightly wider bunch. Picking uniformly across the entire [0, d) window is what actually scatters them.

    If you want the simulations behind that, see AWS’s Exponential Backoff and Jitter post for the graphs.

Make sleep mockable, and make it respect context

Here’s a practical problem you hit the moment you write a test. If the test really sleeps through exponential backoff, every case takes seconds, and a suite full of them crawls.

So don’t call time.Sleep directly. Put it behind an interface, and while you’re there, make it cancellable.

That select on ctx.Done() is the part most hand‑rolled retry loops miss. They check the context once at the top of the loop, then block on a plain time.Sleep for ten seconds. If the request is cancelled during that wait, nothing happens until the sleep ends. With the timer‑and‑context version, a cancelled request stops right away. On a deploy, that’s the difference between a clean shutdown and a thirty‑second hang.

The client, where everything comes together

Let’s walk through the decisions, because they’re the whole point.

  • Body handling. We buffer the body once, up front, and only when it’s a raw stream. Requests built with http.NewRequest from a bytes.Reader or a string already carry GetBody, so we reuse it instead of reading anything. We also don’t mutate the request you passed in. (This does read the whole body into memory, so if you’re streaming a very large upload, know that retries and streaming pull in opposite directions, and pick one on purpose.)
  • Retry policy. The default policy won’t retry your POST, and this is the one I’d most want you to take away. A POST that timed out may have already succeeded on the server before the response got lost on the way back. Retry it blindly and you’ve charged the customer twice. So GET, HEAD, PUT and the other idempotent methods (the ones that are safe to repeat) retry by default, and POST doesn’t unless you opt in with an idempotency key. The policy gets the request so it can actually make that distinction.
  • Retry-After handling. It is respected and capped. If a server says “wait two seconds”, we wait two seconds. If a misconfigured server says “wait 86400”, we don’t park the goroutine for a day. We also handle the HTTP‑date form of the header, not just delta‑seconds, since the spec allows both and the date form is the one that usually gets forgotten.
  • Return value. When we finally give up, we hand back the last response with its body still open, plus an error. An earlier version of mine closed that body before returning it, which left the caller holding a response it couldn’t read. Close it yourself and check the status if you want to know why we stopped.

Now the tests run in microseconds

Because the sleeper is mocked, we can assert the exact backoff schedule, prove Retry-After overrides the backoff, prove a cancelled context stops immediately, and prove a POST is left alone. None of it waits on a real clock. That last test—the one that pins the POST behaviour—is the one I’d be most nervous shipping without.

You probably shouldn’t use any of this

Here’s the uncomfortable turn. For real work, reach for a library. go-retryablehttp does all of this and more, battle‑tested and maintained. The point of this post is to show what such a library does under the hood so you can use it confidently, or roll your own with eyes open.

Why not use an existing library?

hashicorp/go-retryablehttp buffers bodies, respects Retry‑After, does exponential backoff with jitter, and is wired into Terraform and Vault—so it has survived far more abuse than anything you or I will write this week.
If you want a full HTTP client with a chainable API and JSON handling on top, resty has retries built in.

So why build it at all?

Because now you can open that library’s source and read it as a set of decisions instead of magic:

  • Why it buffers
  • Why it clones the request
  • Why its default policy is careful about methods

The day it does something surprising, you’re debugging a thing you understand. That was the deliverable all along – not the code, the understanding.

What actually bites people

A few things that aren’t obvious until they catch you.

1. Idempotency

Retrying is safe until the operation has a side effect, and then it isn’t. At that point you have three choices:

  • Make the operation idempotent,
  • Send an idempotency key, or
  • Don’t retry it.

There’s no fourth choice that ends well.

2. Logging

The instinct is to log the full request and response on every retry, and then your logs fill with duplicated payloads—some of them carrying tokens.

Log only:

  • Attempt number
  • Status code
  • Endpoint

Do not log the body.

3. Timeouts

Retries do not replace timeouts.

  • The HTTP client timeout is per attempt.
  • Your overall deadline belongs in the context.

These are two different clocks; you want both, or one slow call will hold your whole retry loop hostage.

4. Chained retries (the one worth saying out loud)

If Service A makes up to three attempts against Service B, and Service B makes up to three attempts against Service C for each of them, one failing call from A can reach C as 3 × 3 = 9 requests. Stack a few layers and you get a retry storm that keeps a system down long after the original cause is gone.

The fix is a retry budget: cap retries to a small fraction of your request rate so a bad minute can’t multiply into an outage.

  • Jitter handles the herd.
  • A retry budget handles the chain.

Real systems need both.

Next steps

The full implementation is in the gist above.

  1. Clone it.
  2. Run the tests.
  3. Break something and watch what the tests tell you.

That’s how it stuck for me, and how it’ll stick for you.

Happy coding!
