GitHub Broke Git: The Merge Queue Bug That Silently Deleted Your Code

Published: May 2, 2026 at 11:09 PM EDT
8 min read
Source: Dev.to

The Day GitHub Stopped Being Git

At 16:05 UTC on April 23rd, 2026, a regression crept into GitHub’s merge queue. For the next three‑and‑a‑half hours, engineers around the world were reviewing pull requests, clicking “merge,” and watching everything look completely fine:

  • Green checks.
  • Clean diffs.
  • No warnings.

What was actually happening behind the scenes was quietly horrifying.

A PR with a perfectly reasonable +29 / -34 diff would get approved and queued. What landed on main was a commit worth +245 / -1,137. Thousands of lines of code that other engineers had already shipped, reviewed, and moved on from, just gone. And every merge that came after went in on top of that broken history.

The UI showed zero problems. The status page showed no outage. The platform was lying to everyone’s faces.

[Image: Git commit graph showing incorrect merge base]

What Actually Went Wrong Under the Hood

GitHub’s merge queue works by creating a temporary branch for each PR in the queue. Normally, that temp branch starts from the tip of main plus the PR’s diff, CI runs against it, it passes, and it lands.

On April 23rd, the queue started building those temp branches from the wrong starting point. Instead of branching from the current tip of main, it was branching from wherever the feature branch had originally diverged from main, potentially dozens or hundreds of commits back. Then it pushed the entire contents of that temp branch to main.

So if your feature branch was 50 commits behind main when it hit the queue, the “merge” silently removed those 50 commits of other people’s work as a side effect of landing yours. CI passed because the temp branch on its own was internally consistent. main blew up because the temp branch had nothing to do with the current state of main.
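
At the Git level, the difference between the two starting points is easy to show. Below is a minimal sketch of it from Python; the branch names are hypothetical, and this is an illustration of the failure mode, not GitHub’s actual implementation:

```python
import subprocess

def git(*args, cwd="."):
    """Run a git command and return its stdout, stripped."""
    return subprocess.run(
        ["git", *args], cwd=cwd, check=True,
        capture_output=True, text=True
    ).stdout.strip()

# Hypothetical refs; substitute your own repo and branches.
MAIN, FEATURE = "origin/main", "origin/feature/my-change"

# What the queue is supposed to build on: the current tip of main.
correct_base = git("rev-parse", MAIN)

# What the buggy code path effectively used: the point where the
# feature branch originally diverged from main, possibly many
# commits behind the current tip.
stale_base = git("merge-base", MAIN, FEATURE)

# Every commit in this range is other people's landed work that a
# temp branch built on stale_base would silently drop.
at_risk = git("rev-list", "--count", f"{stale_base}..{correct_base}")
print(f"commits between the stale base and the current tip: {at_risk}")
```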

Root cause: a new code path that adjusted merge‑base computation was meant to be gated behind a feature flag for an unreleased feature. The gating was incomplete, the new behavior leaked into production, and it applied to all squash‑merge groups.
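
GitHub has not published the code, but the shape of an “incomplete gating” bug is familiar. A deliberately simplified, hypothetical sketch of what it can look like when one call site checks the flag and another does not:

```python
# Hypothetical sketch, not GitHub's code. The new behavior is gated in
# one entry point but not in the other, so it leaks into production
# even with the flag switched off.

FLAGS = {"new_merge_base_logic": False}  # off in production

def new_merge_base(main_tip, fork_point):
    # Stand-in for the new, unreleased computation.
    return fork_point

def merge_single_pr(main_tip, fork_point):
    # Correctly gated: with the flag off, callers get the old behavior.
    if FLAGS["new_merge_base_logic"]:
        return new_merge_base(main_tip, fork_point)
    return main_tip

def merge_queue_group(main_tip, fork_point):
    # Bug: this path skips the flag check and always uses the new logic.
    return new_merge_base(main_tip, fork_point)

print(merge_single_pr("tip-of-main", "old-fork-point"))    # tip-of-main
print(merge_queue_group("tip-of-main", "old-fork-point"))  # old-fork-point
```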

Three things made this bug particularly nasty

  1. The PR UI lied.
    You reviewed +29 / -34. The commit that landed was +245 / -1,137. The thing engineers approved was not the thing that merged. That breaks the most fundamental contract of a code‑review system.

  2. It was completely silent.
    No merge conflict. No failed check. No banner on the PR. Teams only found out when someone noticed code on main that should have been there simply was not.

  3. It scaled with repo activity.
    The faster a repo was merging, the further feature branches had drifted from main, and the more damage each bad merge did. The teams that relied most on the merge queue got hit the hardest.

The Human Cost

This was not a theoretical problem. Engineering teams spent entire afternoons in incident mode: combing through commit graphs, reconstructing deleted code by hand, coordinating recovery across multiple repos, and filing support tickets that would take days to get a response.

  • One organization reported that every single team running on GitHub’s merge queue got hit, with dozens of bad commits each and hundreds of existing commits clobbered before anyone noticed.
  • One company alone claimed to have experienced over 200 ruined PRs.

GitHub later said 2,092 pull requests across 230 repositories were affected during the impact window of April 22–23. Earlier messaging from GitHub’s COO on X put the number at 2,804 PRs, and some community members pushed back hard on both figures given what individual companies were experiencing.

The incident was not detected by GitHub’s automated monitoring because it affected merge‑commit correctness rather than availability. GitHub only became aware of the regression at 19:38 UTC, following an increase in customer‑support inquiries. The fix—a revert and force‑deploy—was complete by 20:43 UTC. Three hours and thirty‑three minutes of silent corruption.

Why the Status Page Was Useless

If you checked GitHub’s status page on April 23rd, you probably saw nothing alarming. There was no major outage reported. No partial outage.

That is because GitHub’s status‑page calculus specifically excludes “Degraded Performance” from downtime numbers. The platform itself never went down. Developers could still push code, open PRs, and click merge. The fact that clicking merge was silently destroying their codebase did not register as an incident on the dashboard.

This is a telling gap. Uptime and correctness are not the same thing. A bank that processes your transactions but records them incorrectly is not “up.” GitHub processed the merges; it just produced wrong results. The status page was not built to catch that kind of failure.

This Was Not an Isolated Bad Day

It would be easier to move on from this if it were a one‑off. But April 2026 was a genuinely rough stretch for GitHub.

Four days after the merge‑queue incident, on April 27th, GitHub’s Elasticsearch cluster became overloaded (likely from a botnet attack) and search‑backed UI surfaces stopped returning results. Pull‑request lists went blank. Issues disappeared… (the original text cuts off here).

GitHub Incident – April 2026

The Incident Timeline

  • April 22‑23, 2026 – A bug in the merge‑queue caused merge commits to be generated incorrectly. The UI showed a clean green merge, but the underlying history was corrupted.
  • April 28, 2026 – GitHub’s CTO, Vlad Fedorov, posted an apology about reliability. That same morning, a separate security disclosure was released: researchers at Wiz found a critical remote‑code‑execution vulnerability in GitHub’s git push pipeline (CVE‑2026‑3854, CVSS 8.7). A crafted git push could execute unsandboxed code on GitHub’s servers. The fix was deployed in 75 minutes.

Three major failures in five days – merge‑queue correctness, search collapse, and an RCE in the core git push path.

Scale Pressures

Fedorov explained that GitHub had originally planned a 10× capacity increase for October 2025. By February 2026, the surge of AI‑driven development tools (Copilot, Cursor, Codex) forced a 30× redesign. Current peaks:

  • 90 million merged PRs per day
  • 1.4 billion commits per day

The Deeper Architectural Problem

GitHub’s merge queue builds merge commits via a different code path than a manual “Merge pull request”. This duplication creates two places where behavior can diverge silently.

  • Delegation risk – The queue automates what a human would do, but once it adds its own logic, it can produce commits that no one wrote or approved.
  • Pattern beyond GitHub – Any automated system (queues, bots, AI agents) with write access can introduce invisible failure modes when it does something a human would not.

Lesson:
Do not avoid merge queues, but ensure that anything writing to main stays as close as possible to boring, well‑understood Git operations, without novel logic that reviewers cannot audit.
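
One concrete way to make that auditable (a suggestion, not something GitHub ships): have the CI job that already runs against the queue’s candidate commit assert that, relative to the previous tip of main, the candidate only touches files the PR touched. A minimal sketch, assuming the job can see both the PR’s refs and the prior tip of main; every ref name here is hypothetical:

```python
import subprocess
import sys

def git(*args):
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout

def changed_files(a, b):
    """Set of paths that differ between two commits."""
    return set(git("diff", "--name-only", a, b).splitlines())

# Hypothetical refs: the tip of main before the queue merge, the
# candidate commit the queue wants to land, and the PR's own range.
old_main  = "origin/main"
candidate = "merge-queue-candidate"
pr_base, pr_head = "pr-base", "pr-head"

reviewed = changed_files(pr_base, pr_head)
landing  = changed_files(old_main, candidate)

# The candidate should only touch files the PR touched; anything else
# means it is rewriting work nobody reviewed.
unexpected = landing - reviewed
if unexpected:
    sys.exit(f"refusing to merge; unexpected changes to: {sorted(unexpected)}")
```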

Will Anyone Actually Leave?

Short answer: Probably not in any significant numbers.

  • Stickiness: CI pipelines, webhooks, RBAC policies, Actions workflows, third‑party app permissions, team structures, and PR history make migration a multi‑month effort.
  • Utility mindset: GitHub is the default platform for open source and most integrations. Developers rarely switch utilities because of a bad week.

What should change:
The baseline of trust. Infrastructure that silently corrupts data, even briefly, requires a solid recovery plan.

Immediate actions for teams:

  1. Verify – Audit squash merges in merge‑queue groups (≥ 2 PRs) from the April 22‑23 window (a rough audit sketch follows this list).
  2. Document assumptions – List parts of your build/deploy pipeline that assume Git history is correct and make those assumptions visible for review.
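
For that audit, a rough starting point is to list everything that landed on main’s first‑parent line during the impact window and compare each commit’s diffstat with what the PR showed at review time. A minimal sketch, assuming origin/main is fetched and up to date; the date filters use commit timestamps, so treat the boundaries as approximate:

```python
import subprocess

def git(*args):
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout

# Commits that actually landed on main during the impact window (UTC).
log = git("log", "--first-parent", "origin/main",
          "--since=2026-04-22 00:00 +0000",
          "--until=2026-04-24 00:00 +0000",
          "--format=%H %s")

for line in log.splitlines():
    sha, _, subject = line.partition(" ")
    # Size of the change each landed commit made relative to its first
    # parent; compare this against the diff you approved in the PR.
    stat = git("diff", "--shortstat", f"{sha}^", sha).strip()
    print(f"{sha[:10]}  {stat:<60}  {subject}")
```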

What GitHub Says It Is Doing

GitHub’s post‑incident response includes concrete commitments:

  • Expand test coverage for merge‑correctness validation.
  • Add regression checks that validate resulting Git contents across supported merge configurations before production.
  • Migrate performance‑sensitive code from the older Ruby codebase to Go.
  • Move systems to public‑cloud infrastructure to meet the 30× scale requirement.

The April 23 bug stemmed from incomplete feature‑flagging on a new code path; the immediate fix was a revert, and the longer‑term fix is richer test coverage for multi‑PR merge‑queue groups.

The Takeaway

On April 23, 2026, GitHub’s merge queue broke the core contract of version control: what you approve is what merges. It did so silently, with a clean UI, no errors, and no status‑page entry.

  • The code remained in Git object storage, but the branch history was wrong.
  • No automated system could safely repair every affected repository; engineers had to intervene manually (a rough sketch of where that work starts follows below).
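
Neither GitHub nor a generic script can automate that repair, but the manual work has a common starting point: for a known‑bad merge, turn everything it removed relative to the previous tip of main into a reviewable patch. A minimal sketch; the SHA is a hypothetical placeholder for a bad merge identified during the audit above:

```python
import subprocess

def git(*args):
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout

bad_merge = "abc1234def"        # hypothetical SHA of a bad queue merge
last_good = f"{bad_merge}^"     # the tip of main just before it landed

# A reverse patch: applied to the bad merge's tree it would restore
# everything that merge removed, but it would also undo the PR the
# merge was supposed to land. Treat it as raw material for manual
# reconstruction, not as an automatic fix.
patch = git("diff", bad_merge, last_good)
with open("clobbered-code.patch", "w") as f:
    f.write(patch)

# A quick summary of which files lost content.
print(git("diff", "--stat", bad_merge, last_good))
```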

Bottom line: Git is meant to be the boring, reliable layer that everything else builds on. When that layer becomes “interesting,” it does so in the worst possible way.


If you found this useful, drop a comment below or follow for more deep dives into the tools we trust (sometimes too much).
