LLMs are Getting a Lot Better and Faster at Finding and Exploiting Zero-Days
Source: Schneier on Security
Opus 4.6’s Leap in Vulnerability Discovery
Opus 4.6 is notably better at finding high‑severity vulnerabilities than previous models, a sign of how fast model capabilities are advancing. Security teams have been automating vulnerability discovery for years, investing heavily in fuzzing infrastructure and custom harnesses to find bugs at scale. What stood out in early testing is how quickly Opus 4.6 found vulnerabilities out of the box, without task‑specific tooling, custom scaffolding, or specialized prompting.
How Opus 4.6 Works Compared to Traditional Fuzzers
Fuzzers work by throwing massive numbers of random inputs at code to see what breaks. Opus 4.6 instead reads and reasons about code the way a human researcher would: looking at past fixes to find similar bugs that weren’t addressed, spotting patterns that tend to cause problems, or understanding a piece of logic well enough to know exactly what input would break it.
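To make the contrast concrete, here is a minimal sketch of the fuzzing side only. Everything in it is an illustrative assumption rather than anything from the original post: `parse_record` is a hypothetical toy parser with a planted bounds bug, and `fuzz` is a purely random loop. Production fuzzers such as AFL or libFuzzer are coverage-guided and far more sophisticated.

```python
import random


def parse_record(data: bytes):
    """Hypothetical toy parser: the first byte declares where a trailer byte sits."""
    if not data:
        raise ValueError("empty input")  # expected, well-handled rejection
    length = data[0]
    # Planted bug: the declared length is trusted without a bounds check,
    # so data[length] raises IndexError whenever length >= len(data).
    return data[1:length], data[length]


def fuzz(target, iterations=1000, max_len=8, seed=0):
    """Feed random byte strings to `target` and collect unexpected exceptions."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # the parser rejected the input cleanly; not a bug
        except Exception as exc:  # anything else counts as a "crash"
            crashes.append((data, exc))
    return crashes


if __name__ == "__main__":
    found = fuzz(parse_record)
    print(f"{len(found)} crashing inputs found")
```

The loop knows nothing about the parser and simply stumbles onto inputs that break it. A reader who understands the logic can instead go straight to a crashing input such as `bytes([5])`, where the declared length points past the end of the data. That is the kind of reasoning the passage above attributes to Opus 4.6.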
When we pointed Opus 4.6 at some of the most well‑tested codebases (projects that have had fuzzers running against them for years, accumulating millions of hours of CPU time), it uncovered high‑severity vulnerabilities, some of which had gone undetected for decades.
The details of how Claude Opus 4.6 found these zero‑days are discussed in the full blog post.