[Paper] WuppieFuzz: Coverage-Guided, Stateful REST API Fuzzing

Published: December 17, 2025 at 11:05 AM EST
Source: arXiv

Overview

The paper presents WuppieFuzz, an open‑source fuzzer that automatically tests REST APIs by generating and mutating sequences of HTTP requests. Built on the modern LibAFL framework, it can operate in white‑box, grey‑box, or black‑box mode and uses an OpenAPI spec to bootstrap a corpus of realistic request flows. By steering fuzzing with code‑coverage feedback, WuppieFuzz reaches deeper, stateful API logic that traditional request‑per‑request fuzzers miss, making it a practical tool for developers who need to harden their services against security bugs.

Key Contributions

  • Stateful, coverage‑guided REST API fuzzing – combines request‑sequence generation with AFL‑style instrumentation to explore complex API states.
  • OpenAPI‑driven corpus creation – automatically builds an initial set of valid request sequences from the API specification, lowering the manual harness‑building effort.
  • Multi‑mode operation – supports white‑box (instrumented), grey‑box (coverage via external hooks), and black‑box (no instrumentation) fuzzing from a single code base.
  • REST‑specific mutators – mutations respect HTTP semantics (e.g., header tweaks, parameter value fuzzing, method changes) while still benefiting from LibAFL’s generic mutators.
  • Rich reporting – produces endpoint‑level coverage maps, crash logs, and state‑trace visualisations to help developers triage findings quickly.
  • Open‑source implementation – the tool is publicly available, encouraging community extensions and integration into CI pipelines.
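
To make the corpus-creation and REST-specific mutation ideas concrete, here is a minimal Python sketch. It is not WuppieFuzz's implementation: the `SPEC` fragment, the `Request` type, the placeholder values in `DEFAULTS`, the "create before delete" ordering in `METHOD_RANK`, and both mutator functions are assumptions made purely for illustration.

```python
import copy
import random
from dataclasses import dataclass, field

# Assumed ordering heuristic: create first, read/update next, delete last.
METHOD_RANK = {"post": 0, "get": 1, "put": 2, "patch": 3, "delete": 4}
# Assumed placeholder value for each schema type.
DEFAULTS = {"string": "example", "integer": 1, "boolean": True}
# Assumed boundary-style values used when fuzzing a parameter.
INTERESTING = ["", "A" * 1024, -1, 2**31, None]

# Hypothetical, hand-written fragment shaped like an OpenAPI "paths" object.
SPEC = {
    "paths": {
        "/pets": {
            "get": {"parameters": [{"name": "limit", "schema": {"type": "integer"}}]},
            "post": {"parameters": [{"name": "name", "schema": {"type": "string"}}]},
        },
        "/pets/{petId}": {
            "delete": {"parameters": [{"name": "petId", "schema": {"type": "integer"}}]},
        },
    }
}


@dataclass
class Request:
    method: str
    path: str
    params: dict = field(default_factory=dict)


def seed_sequence(spec):
    """Build one schema-valid request per operation, ordered by METHOD_RANK."""
    requests = []
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            if method not in METHOD_RANK:
                continue  # skip non-operation keys such as "summary"
            params = {
                p["name"]: DEFAULTS.get(p["schema"]["type"])
                for p in operation.get("parameters", [])
            }
            requests.append(Request(method, path, params))
    return sorted(requests, key=lambda r: METHOD_RANK[r.method])


def mutate_method(sequence, rng):
    """Return a copy of the sequence with one request's HTTP method swapped."""
    mutated = copy.deepcopy(sequence)
    target = rng.choice(mutated)
    target.method = rng.choice([m for m in METHOD_RANK if m != target.method])
    return mutated


def mutate_param(sequence, rng):
    """Return a copy with one parameter replaced by an 'interesting' value."""
    mutated = copy.deepcopy(sequence)
    candidates = [r for r in mutated if r.params]
    if candidates:
        target = rng.choice(candidates)
        name = rng.choice(sorted(target.params))
        target.params[name] = rng.choice(INTERESTING)
    return mutated


if __name__ == "__main__":
    rng = random.Random(0)
    seeds = seed_sequence(SPEC)
    for request in mutate_param(mutate_method(seeds, rng), rng):
        print(request.method.upper(), request.path, request.params)
```

Both mutators copy the sequence rather than editing it in place, so the seed stays in the corpus unchanged while its variants are tried.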

Methodology

  1. Specification parsing – WuppieFuzz reads an OpenAPI (Swagger) document and extracts all paths, methods, parameters, and schemas.
  2. Initial corpus generation – it synthesises a set of valid request sequences (e.g., login → create‑resource → delete‑resource) that respect required authentication flows and data dependencies.
  3. Instrumentation & coverage collection
    • White‑box: the target service is compiled with LLVM instrumentation (via LibAFL) to obtain fine‑grained branch coverage.
    • Grey‑box: a lightweight coverage shim (e.g., libcoverage) reports basic block hits.
    • Black‑box: coverage is approximated through response‑code and header analysis.
  4. Mutation engine – a hybrid of LibAFL’s generic byte‑level mutators and REST‑aware mutators (parameter value fuzzing, header injection, method swapping, payload structure alteration).
  5. Power scheduling – the fuzzer dynamically allocates more mutation “energy” to request sequences that have historically uncovered new coverage (e.g., using AFL‑style power schedules such as “fast” or “exploit”).
  6. Feedback loop – after each mutated sequence is sent, coverage data is fed back to the scheduler, which decides the next sequences to evolve, aiming to drive the API into previously unseen states.
  7. Reporting – crashes, assertion failures, and unusual HTTP responses are logged together with the request trace that triggered them, plus visual coverage heatmaps per endpoint.
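
Steps 4 to 6 can be sketched as a small coverage-guided loop. The example below is a toy under stated assumptions, not the tool's LibAFL-based engine: `ToyApi` is a hypothetical in-process stand-in for a stateful service, its hand-placed labels in `covered` play the role of branch coverage, and the energy rule (one extra point for a productive parent, one point per new label for the child) is a simplification chosen for readability.

```python
import random

METHODS = ["GET", "POST", "DELETE", "PATCH"]
IDS = [0, 1]


class ToyApi:
    """Hypothetical stateful service; `covered` mimics branch coverage."""

    def __init__(self):
        self.items = set()
        self.covered = set()

    def handle(self, method, item_id):
        if method == "POST":
            self.covered.add("create")
            self.items.add(item_id)
            return 201
        if method in ("GET", "DELETE"):
            prefix = "read" if method == "GET" else "delete"
            if item_id not in self.items:
                self.covered.add(prefix + "_miss")
                return 404
            self.covered.add(prefix + "_hit")
            if method == "DELETE":
                self.items.remove(item_id)
            return 200
        self.covered.add("bad_method")
        return 405


def run_sequence(sequence):
    """Replay a request sequence against fresh state; return its coverage."""
    api = ToyApi()
    for method, item_id in sequence:
        api.handle(method, item_id)
    return api.covered


def mutate(sequence, rng):
    """Insert a request, swap a method, or change an id (on a copy)."""
    sequence = list(sequence)
    choice = rng.randrange(3)
    if choice == 0 or not sequence:
        position = rng.randrange(len(sequence) + 1)
        sequence.insert(position, (rng.choice(METHODS), rng.choice(IDS)))
    else:
        i = rng.randrange(len(sequence))
        method, item_id = sequence[i]
        if choice == 1:
            sequence[i] = (rng.choice(METHODS), item_id)
        else:
            sequence[i] = (method, rng.choice(IDS))
    return sequence


def fuzz(seeds, iterations=5000, seed=0):
    rng = random.Random(seed)
    corpus = [{"seq": list(s), "energy": 1} for s in seeds]
    global_coverage = set()
    for entry in corpus:
        global_coverage |= run_sequence(entry["seq"])
    for _ in range(iterations):
        # Power scheduling: pick parents in proportion to their energy.
        weights = [entry["energy"] for entry in corpus]
        parent = rng.choices(corpus, weights=weights)[0]
        child = mutate(parent["seq"], rng)
        new = run_sequence(child) - global_coverage
        if new:
            # Feedback: keep the child and reward the productive parent.
            global_coverage |= new
            parent["energy"] += 1
            corpus.append({"seq": child, "energy": 1 + len(new)})
    return global_coverage, corpus


if __name__ == "__main__":
    coverage, corpus = fuzz([[("GET", 0)]])
    print(sorted(coverage))
    print(len(corpus), "sequences kept")
```

The `read_hit` and `delete_hit` labels are reachable only after a `POST` for the same id earlier in the sequence, which is why the loop keeps and mutates whole sequences instead of single requests.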

Results & Findings

  • Coverage boost – On the classic Petstore demo, WuppieFuzz achieved ≈ 92 % endpoint coverage and ≈ 85 % branch coverage within 30 minutes, outperforming a baseline black‑box fuzzer by 30 % and a naïve state‑agnostic fuzzer by 45 %.
  • Stateful bug discovery – The tool uncovered three logic bugs that only manifested after a specific request order (e.g., creating a resource, updating it, then deleting a related entity). These bugs were missed by single‑request fuzzers.
  • Power schedule impact – The “exploit” schedule (favoring high‑coverage seeds) yielded the fastest growth in both endpoint and code coverage, while the “fast” schedule provided a more balanced exploration of shallow and deep states.
  • Automation gains – Harness creation time dropped from several hours (manual setup) to under 10 minutes using the OpenAPI‑driven corpus, demonstrating the practicality of the approach for real projects.
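
The stateful-bug finding is easier to appreciate with a toy case. The sketch below is not one of the bugs reported in the paper: `Inventory`, its double-cancel defect, and the stock invariant are all hypothetical. It shows a defect that no sequence of one or two requests can trigger from a fresh state, while a three-request sequence can.

```python
from itertools import product

OPS = ("create", "cancel")


class Inventory:
    """Hypothetical service holding one item in stock (not from the paper)."""

    def __init__(self):
        self.capacity = 1
        self.stock = 1
        self.order = None

    def create(self):
        if self.stock == 0:
            return 409
        self.stock -= 1
        self.order = "open"
        return 201

    def cancel(self):
        if self.order is None:
            return 404
        # Defect: the order state is not checked, so cancelling twice
        # returns the same item to stock twice.
        self.order = "cancelled"
        self.stock += 1
        return 200


def violates_invariant(sequence):
    """Replay the sequence on fresh state; True if stock exceeds capacity."""
    service = Inventory()
    for op in sequence:
        getattr(service, op)()
        if service.stock > service.capacity:
            return True
    return False


def shortest_violation(max_length):
    """Exhaustively try sequences of increasing length."""
    for length in range(1, max_length + 1):
        for sequence in product(OPS, repeat=length):
            if violates_invariant(sequence):
                return sequence
    return None


if __name__ == "__main__":
    print(shortest_violation(2))  # None
    print(shortest_violation(3))  # ('create', 'cancel', 'cancel')
```

Exhaustive enumeration works here only because the toy has two operations. On a real API the sequence space is far too large, which is the gap coverage feedback is meant to close.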

Practical Implications

  • CI/CD integration – Teams can plug WuppieFuzz into their build pipelines to continuously monitor API surface changes, catching regressions before release.
  • Security hardening – By automatically exercising complex stateful flows, developers can discover authentication bypasses, improper input validation, and race conditions that are hard to spot with unit tests.
  • Reduced manual effort – The OpenAPI‑based corpus generation eliminates the need for hand‑crafted test harnesses, making fuzzing accessible to developers who may not be security experts.
  • Scalable testing – Because the fuzzer works in both white‑box and black‑box modes, it can be used early in development (when source is available) and later in production‑like environments (where only the running service is reachable).
  • Extensibility – The open‑source nature and reliance on LibAFL mean that custom mutators (e.g., for GraphQL or gRPC) or domain‑specific coverage hooks can be added without rewriting the core engine.

Limitations & Future Work

  • Instrumentation overhead – White‑box mode requires recompiling the service with LLVM instrumentation, which may be infeasible for some legacy binaries or languages lacking LLVM support.
  • Specification quality dependency – The initial corpus quality hinges on a complete and accurate OpenAPI spec; missing auth flows or optional parameters can limit state coverage.
  • Scalability to large microservice meshes – The current evaluation focuses on a single API; extending the approach to orchestrate fuzzing across many inter‑dependent services remains an open challenge.
  • Dynamic authentication – Handling token refresh, OAuth redirects, or custom auth mechanisms still needs manual glue code; future work could automate these flows.
  • Advanced state inference – The authors plan to incorporate machine‑learning models to better predict which request sequences are likely to unlock new states, reducing reliance on pure coverage feedback.
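
As an illustration of the kind of manual glue code the dynamic-authentication item refers to, the sketch below caches a bearer token and refreshes it shortly before expiry. Everything in it is an assumption: `TokenProvider` and its parameters are hypothetical names, and `fetch_token` stands for whatever call obtains a token from the service under test. None of it is part of WuppieFuzz.

```python
import time


class TokenProvider:
    """Hypothetical helper that keeps a bearer token fresh between requests."""

    def __init__(self, fetch_token, clock=time.monotonic, margin=5.0):
        # fetch_token: callable returning (token, lifetime_in_seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin  # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def headers(self):
        """Return an Authorization header, refreshing the token if needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token, lifetime = self._fetch_token()
            self._expires_at = now + lifetime
        return {"Authorization": f"Bearer {self._token}"}


if __name__ == "__main__":
    # Stand-in for a real login call; returns a fixed token valid for 60 s.
    provider = TokenProvider(lambda: ("demo-token", 60.0))
    print(provider.headers())
```

Passing the clock in as a parameter keeps the helper testable without sleeping, which matters when it sits in the hot path of a fuzzing loop.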

WuppieFuzz demonstrates that coverage‑guided, stateful fuzzing is more than an academic curiosity: it can be a concrete, developer‑friendly addition to the security toolbox for modern RESTful services.

Authors

  • Thomas Rooijakkers
  • Anne Nijsten
  • Cristian Daniele
  • Erieke Weitenberg
  • Ringo Groenewegen
  • Arthur Melissen

Paper Information

  • arXiv ID: 2512.15554v1
  • Categories: cs.SE
  • Published: December 17, 2025