The Dangers of SSL Certificates

Published: (December 27, 2025 at 05:41 PM EST)
2 min read

Source: Hacker News

Bazel SSL Certificate Expiration Incident

Yesterday, the Bazel team at Google did not have a very Merry Boxing Day. An SSL certificate expired for and , as shown in this screenshot from the GitHub issue.

Screenshot of expired certificate

Impact on Users

The expired certificate broke the build workflow of users who use Bazel, who were faced with the following error message:

ERROR: Error computing the main repository mapping: Error accessing registry https://bcr.bazel.build/: Failed to fetch registry file https://bcr.bazel.build/modules/platforms/0.0.7/MODULE.bazel: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed

Incident Summary

After mitigation, Xùdōng Yáng provided a brief summary of the incident on the GitHub ticket:

Summary screenshot

it was an auto‑renewal being bricked due to some new subdomain additions, and the renewal failures didn’t send notifications for whatever reason.

Why SSL Certificates Are Dangerous

Say the words “expired SSL certificate” to any senior software engineer and watch the expression on their face. Everybody in this industry has been bitten by expired certs, including people who work at organizations that use automated certificate renewal. In fact, this case is an example of an automated certificate renewal system that failed!

  • SSL certificates are a fundamentally dangerous technology.
  • Operational experience is limited until something goes wrong.
  • When a failure occurs, teams often start from scratch to diagnose and fix it.
  • In this incident, Bazel team members who were very unfamiliar with this whole area had to scramble to read documentation and secure permissions.

Even if the team has local SSL certificate expertise, those members were out of the office because of the holiday. With an automated “set‑it‑and‑forget‑it” solution, knowledge does not spread across the team because it just works—until it stops working.

Failure Mode Characteristics

  • The failure mode is the opposite of graceful degradation.
  • One minute everything works fine; the next minute every HTTP request fails.
  • There is no natural signal to operators that the SSL certificate is approaching expiry.
  • No staging of the change is possible because the change is time‑based and affects all users simultaneously.

In other words, SSL certs have an expected failure mode (expiration) that maximizes blast radius—a hard failure for 100 % of users—without any natural feedback to operators that the system is at imminent risk of critical failure. With automated cert renewal, the likelihood increases that responders will not have experience with renewing certificates.

Is it any wonder that these keep biting us?

Back to Blog

Related posts

Read more »