Making Parallel HTTP Requests Stable in Go: Lessons from Building a Markdown Linter
When building gomarklint, a Go‑based Markdown linter, I faced a challenge: checking 100,000+ lines of documentation for broken links. Parallelizing this with goroutines seemed like a “no‑brainer,” but it immediately led to flaky tests in CI environments. Speed is easy in Go; stability is the real challenge. Below are the three patterns I implemented to achieve both.
URL Caching
In a large docset, the same URL appears dozens of times. A naïve concurrent approach sends a request for every occurrence, which can look like a DoS attack to the host. Using sync.Map, I implemented a simple URL cache to ensure each unique URL is checked only once.
var urlCache sync.Map

// Check if we've seen this URL before
if val, ok := urlCache.Load(url); ok {
    return val.(*checkResult), nil
}
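The hit path above is only half of the pattern: on a miss, the freshly computed result has to be written back so every later occurrence of the same URL is answered from memory. A minimal sketch of that write side (performCheck is a hypothetical stand-in for the actual HTTP request; the checkResult type is shown further down):

// Cache miss: run the real check once, then remember the outcome
result := performCheck(url) // hypothetical helper returning *checkResult
urlCache.Store(url, result)
return result, nil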
Limiting Concurrency
Even with a cache, checking many unique URLs simultaneously can exhaust local file descriptors or trigger rate limits. A semaphore channel caps the number of concurrent checks.
maxConcurrency := 10
sem := make(chan struct{}, maxConcurrency)
for _, url := range urls {
    sem <- struct{}{} // acquire a slot
    go func(url string) {
        defer func() { <-sem }() // release the slot when done
        // check the URL here...
    }(url)
}
Even with the cap in place, a flaky host can answer with a transient 5xx. Rather than flagging the link as broken immediately, the checker sleeps and retries, waiting a little longer on each attempt:
if resp.StatusCode >= 500 {
    time.Sleep(retryDelay * time.Duration(attempt))
    // retry...
}
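For context, here is roughly how that retry snippet might sit inside a small loop. The maxRetries and retryDelay values and the use of client.Head are my assumptions rather than gomarklint's actual implementation (assumes fmt, net/http, and time are imported):

// checkWithRetry treats network errors and 5xx responses as transient,
// backing off a little longer before each subsequent attempt.
func checkWithRetry(client *http.Client, url string) (int, error) {
    const (
        maxRetries = 3
        retryDelay = 500 * time.Millisecond
    )

    var lastErr error
    for attempt := 1; attempt <= maxRetries; attempt++ {
        resp, err := client.Head(url)
        if err == nil {
            resp.Body.Close()
            if resp.StatusCode < 500 {
                return resp.StatusCode, nil // success or a definitive 4xx
            }
            lastErr = fmt.Errorf("server error: %d", resp.StatusCode)
        } else {
            lastErr = err
        }
        time.Sleep(retryDelay * time.Duration(attempt)) // linear backoff
    }
    return 0, lastErr
}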
Caching the Full Result
The most elusive bug was caching only the status code. If a request timed out, I stored status: 0. Subsequent checks retrieved 0 but didn’t know an error had occurred, leading to inconsistent logic. The fix is to cache the entire result, including the error.
type checkResult struct {
    status int
    err    error
}
// Store the pointer to this struct in your cache
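Putting the pieces together, a cache-aware check might look roughly like the sketch below. The checkURL name and the use of client.Head are illustrative assumptions; the important part, per this section, is that the whole struct goes into the cache, error included:

// checkURL returns the cached result for a URL if one exists; otherwise
// it performs a single HTTP check and caches status AND error together.
func checkURL(client *http.Client, url string) *checkResult {
    if val, ok := urlCache.Load(url); ok {
        return val.(*checkResult)
    }

    res := &checkResult{}
    resp, err := client.Head(url)
    if err != nil {
        res.err = err // timeouts and DNS failures are cached too
    } else {
        res.status = resp.StatusCode
        resp.Body.Close()
    }

    urlCache.Store(url, res)
    return res
}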
Dealing with Cache Stampedes
Even with the above measures, “cache stampedes” (multiple goroutines hitting the same uncached URL at the exact same millisecond) remain a concern. I’m currently exploring golang.org/x/sync/singleflight to solve this.
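For reference, the shape of a singleflight-based fix would be something like the sketch below: goroutines asking about the same URL share one in-flight check instead of each firing their own request. This assumes the checkURL helper sketched above and an import of golang.org/x/sync/singleflight:

var group singleflight.Group

// Goroutines that request the same URL at the same moment share a single
// in-flight check; only the first caller actually does the work.
func checkURLShared(client *http.Client, url string) *checkResult {
    v, _, _ := group.Do(url, func() (interface{}, error) {
        return checkURL(client, url), nil
    })
    return v.(*checkResult)
}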
If you have experience tuning http.Client for massive parallel checks, I’d love to hear your thoughts in the comments or on GitHub!