[Paper] Peeling Off the Cocoon: Unveiling Suppressed Golden Seeds for Mutational Greybox Fuzzing
Source: arXiv - 2602.23736v1
Overview
The paper introduces PoCo, a novel technique that “peels off” obstacle‑conditional statements to expose hidden, high‑value seeds that traditional coverage‑based seed selection (CSS) tools like afl‑cmin often miss. By temporarily disabling these blockers, PoCo enables deeper exploration of a program’s state space, leading to richer seed pools and more effective grey‑box fuzzing.
Key Contributions
- Obstacle‑Conditional Identification: A lightweight static‑dynamic analysis that pinpoints conditionals that act as “cocoons”, preventing the fuzzing engine from reaching downstream code.
- Seed‑Peeling Mechanism: An instrumentation scheme that temporarily forces obstacle conditionals to evaluate both true and false, allowing the fuzzer to collect seeds that would otherwise be suppressed.
- Enhanced CSS Pipeline: Integration of PoCo with existing CSS tools (e.g., afl‑cmin, afl‑tmin) to produce a smaller yet more powerful seed set without sacrificing coverage.
- Empirical Validation: Experiments on 12 real‑world open‑source projects show up to 23 % increase in edge coverage and 2–5× more unique crashes compared with vanilla CSS.
- Open‑Source Prototype: The authors release PoCo as a plug‑in for AFL‑2.52b, making it easy for practitioners to try it out.
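To make the idea of coverage-based seed selection concrete, here is a minimal sketch of a greedy selection over per-seed edge sets. It is a simplification written for illustration, not the actual algorithm used by afl‑cmin or by PoCo; the function name `select_seeds`, the file names, and the edge IDs are all hypothetical.

```python
from typing import Dict, List, Set


def select_seeds(coverage: Dict[str, Set[int]]) -> List[str]:
    """Greedily pick seeds until every edge covered by the corpus is represented."""
    remaining: Set[int] = set().union(*coverage.values())
    selected: List[str] = []
    while remaining:
        # Pick the seed that covers the most still-uncovered edges.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda s: len(coverage[s] & remaining))
        selected.append(best)
        remaining -= coverage[best]
    return selected


# Hypothetical corpus: seed name -> set of edge IDs it exercises.
corpus = {
    "a.png": {1, 2, 3},
    "b.png": {2, 3},
    "c.png": {3, 4},
    "d.png": {1, 4, 5},
}
print(select_seeds(corpus))  # ['a.png', 'd.png']
```

The selected subset covers the same edges as the full corpus with fewer files. A seed whose interesting behaviour sits behind a branch that no input passes contributes no extra edges here, so a selection of this kind has no reason to keep it; that is the gap the summary says PoCo targets.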
Methodology
- Static Scan: PoCo extracts the target binary’s control‑flow graph (CFG) and locates obstacle conditionals: branches whose true/false outcomes gate large downstream sub‑graphs but are rarely exercised by the initial seed corpus.
- Dynamic Profiling: While the program runs under a baseline fuzzer, PoCo records which conditionals are never reached and which always resolve to the same outcome.
- Conditional Peeling: For each identified obstacle, PoCo injects a tiny instrumentation stub that forces the branch to evaluate both ways on successive runs (e.g., by toggling a global flag). This does not alter the program’s logic permanently; the original condition is restored after seed collection.
- Deep Seed Selection: The fuzzer is rerun on the instrumented binary. Because the blockers are temporarily lifted, the fuzzer can generate inputs that reach deeper code paths. These inputs are fed to a standard CSS tool (afl‑cmin) to prune redundancies while preserving the newly discovered coverage.
- Mapping Back: The final seed set is validated against the original (uninstrumented) binary to ensure that each seed still triggers the intended paths without the artificial overrides.
The whole pipeline is fully automated and adds only a modest runtime overhead (≈ 5 % on average) because the instrumentation is lightweight and only active during the seed‑generation phase.
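The profiling and peeling steps can be illustrated with a toy model in pure Python. This is only a sketch under stated assumptions: PoCo itself instruments binaries, whereas here a conditional is wrapped by a hypothetical `branch()` stub, the override table `FORCED` stands in for the "global flag" mentioned above, and `parse` is an invented target.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical override table: branch id -> forced outcome.
# An empty table means every branch keeps its original behaviour.
FORCED: Dict[str, bool] = {}
# Outcomes seen per branch while profiling.
OBSERVED: Dict[str, Set[bool]] = defaultdict(set)


def branch(branch_id: str, condition: bool) -> bool:
    """Stub wrapped around a conditional: honour any override, record the outcome."""
    outcome = FORCED.get(branch_id, condition)
    OBSERVED[branch_id].add(outcome)
    return outcome


def parse(data: bytes) -> str:
    """Invented target: a magic-byte check guards the deeper code."""
    if branch("magic", data[:2] == b"PK"):
        if branch("long", len(data) > 4):
            return "deep"
        return "shallow"
    return "rejected"


def find_one_sided(target: Callable[[bytes], str], seeds: Iterable[bytes]) -> List[str]:
    """Profile the corpus and return branches that only ever took one outcome."""
    OBSERVED.clear()
    for seed in seeds:
        target(seed)
    return sorted(b for b, seen in OBSERVED.items() if len(seen) == 1)


def run_peeled(target: Callable[[bytes], str], seed: bytes,
               branch_id: str, outcome: bool) -> str:
    """Run once with one branch forced, then restore the original condition."""
    FORCED[branch_id] = outcome
    try:
        return target(seed)
    finally:
        FORCED.pop(branch_id, None)


seeds = [b"hello", b"xy"]
print(find_one_sided(parse, seeds))                # ['magic']
print(run_peeled(parse, b"hello", "magic", True))  # deep
print(parse(b"hello"))                             # rejected
```

With the override active, `b"hello"` reaches the "deep" path; once the override is removed the same input is rejected again, which is why the pipeline ends with a validation pass against the uninstrumented target.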
Results & Findings
| Benchmark | Baseline (afl‑cmin) | PoCo‑augmented | Coverage ↑ | New Crashes |
|---|---|---|---|---|
| libpng | 12 800 edges | 15 800 edges | +23 % | +3 |
| openssl | 9 450 edges | 11 200 edges | +18 % | +2 |
| sqlite3 | 14 300 edges | 16 100 edges | +13 % | +1 |
| … | … | … | … | … |
- Seed Set Size: PoCo reduces the final seed corpus by ~30 % compared with the baseline while increasing coverage, meaning less storage and faster fuzzing cycles.
- Bug Discovery: In a 24‑hour fuzzing window, PoCo‑enhanced runs uncovered 7 previously unknown CVEs across the benchmark suite, all triggered by seeds that were absent from the vanilla CSS output.
- Performance Overhead: The instrumentation phase adds ~5 % CPU overhead, but the downstream fuzzing speed improves by ~12 % thanks to the smaller, higher‑quality seed pool.
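As a quick sanity check on the Coverage ↑ column of the table above, the relative gain can be recomputed from the two edge counts as (augmented − baseline) / baseline. The snippet below does this for the three listed rows; the helper name `relative_gain` is just illustrative.

```python
# Edge counts copied from the table: (baseline, PoCo-augmented).
rows = {
    "libpng": (12_800, 15_800),
    "openssl": (9_450, 11_200),
    "sqlite3": (14_300, 16_100),
}


def relative_gain(baseline: int, augmented: int) -> float:
    """Relative coverage increase in percent."""
    return (augmented - baseline) / baseline * 100


for name, (base, aug) in rows.items():
    print(f"{name}: +{relative_gain(base, aug):.1f} %")
```

The recomputed values agree with the reported percentages to within one percentage point.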
Practical Implications
- Faster Time‑to‑Bug: Development teams can integrate PoCo into CI pipelines to automatically “unlock” deeper code paths early, shortening the feedback loop for security testing.
- Resource Efficiency: Smaller seed sets mean less disk I/O and memory pressure for fuzzing clusters, translating to cost savings in large‑scale fuzzing farms.
- Better Coverage for Hardened Code: Applications that perform heavy input validation (e.g., parsers, crypto libraries) often stall fuzzers at early checks; PoCo’s conditional peeling circumvents these roadblocks without needing heavyweight symbolic execution.
- Plug‑and‑Play: Since PoCo works as a thin wrapper around AFL’s existing tools, teams can adopt it without rewriting their fuzzing harnesses or retraining engineers.
Limitations & Future Work
- Data‑Dependent Obstacles: PoCo currently focuses on conditionals it can identify syntactically; branches whose outcomes depend on values computed over the input (e.g., checksums, cryptographic hashes) may still block deeper exploration.
- Potential False Positives: Forcing a branch both ways can generate seeds that only succeed under the artificial override, requiring the final validation step to filter them out.
- Scalability of Static Analysis: Extremely large binaries (e.g., monolithic browsers) may cause the CFG extraction to become a bottleneck; incremental or sampling‑based analysis is a possible remedy.
- Future Directions: The authors plan to (1) combine PoCo with lightweight symbolic execution to handle data‑dependent obstacles, (2) extend support to other fuzzers (e.g., libFuzzer, Honggfuzz), and (3) explore automated prioritization of which obstacles to peel based on historical bug‑finding ROI.
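The false-positive concern can be made concrete with a small sketch of a validation filter: each candidate seed is replayed on the original target, and it is kept only if it still contributes coverage there. This is an assumption about how such a filter could work, not the authors' implementation; `filter_valid_seeds`, `toy_original`, and the edge IDs are hypothetical.

```python
from typing import Callable, Iterable, List, Set

Trace = Set[int]


def filter_valid_seeds(seeds: Iterable[bytes],
                       run_original: Callable[[bytes], Trace],
                       baseline: Trace) -> List[bytes]:
    """Keep seeds that add new edges when replayed WITHOUT any forced branches."""
    kept: List[bytes] = []
    covered = set(baseline)
    for seed in seeds:
        new_edges = run_original(seed) - covered
        if new_edges:
            kept.append(seed)
            covered |= new_edges
    return kept


def toy_original(data: bytes) -> Trace:
    """Invented uninstrumented target returning the edge IDs it executed."""
    edges = {0}
    if data[:2] == b"PK":
        edges.add(1)
        if len(data) > 4:
            edges.add(2)
    return edges


candidates = [b"hello", b"PKzip!", b"PK"]
print(filter_valid_seeds(candidates, toy_original, baseline={0}))  # [b'PKzip!']
```

Here `b"hello"` would only look interesting while the magic-byte check is overridden, so it is dropped; `b"PK"` is dropped because `b"PKzip!"` already covers its edges.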
Authors
- Ruixiang Qian
- Chunrong Fang
- Zengxu Chen
- Youxin Fu
- Zhenyu Chen
Paper Information
- arXiv ID: 2602.23736v1
- Categories: cs.SE
- Published: February 27, 2026