[Paper] Same Engine, Multiple Gears: Parallelizing Fixpoint Iteration at Different Granularities (Extended Version)
Source: arXiv - 2602.06680v1
Overview
The paper “Same Engine, Multiple Gears: Parallelizing Fixpoint Iteration at Different Granularities” tackles a long‑standing bottleneck in static analysis: the fixpoint computation that underlies most analyzers. By making the fixpoint engine granularity‑agnostic, the authors show how the same solver can be run in a “low‑gear” mode (few, coarse tasks) or a “high‑gear” mode (many fine‑grained tasks), adapting to the hardware and the size of the code base. Their experiments with the Goblint analyzer on large, real‑world projects demonstrate substantial speed‑ups without sacrificing analysis precision.
Key Contributions
- Granularity‑parametric fixpoint engine – a single solver that can be configured to run with tasks of any size, from coarse tasks spanning entire program threads down to individual statements.
- Two parallelization philosophies:
  - Immediate approach – all workers share a single thread‑safe hash table that holds the global solver state.
  - Independent approach – each worker maintains its own local state and synchronizes via a publish/subscribe data structure.
- Integration with Goblint – the authors implemented both philosophies inside the open‑source static analysis framework, providing a concrete reference implementation.
- Comprehensive empirical evaluation – performance measurements on several large open‑source projects (e.g., the Linux kernel, PostgreSQL) showing up to 5× speed‑up on a 32‑core machine.
- Design guidelines for choosing the appropriate granularity and parallelization strategy based on program characteristics and hardware resources.
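The first contribution is easiest to picture as an ordinary worklist solver whose task set is simply a parameter. The following is a minimal Python sketch, not the paper's actual engine: it assumes set-valued abstract states joined by union and a monotone transfer function, and all names (`solve`, `deps`, `transfer`) are illustrative.

```python
from collections import defaultdict

def solve(deps, transfer, tasks, bottom=frozenset()):
    """Sequential worklist fixpoint solver; `tasks` is the granularity
    knob (e.g. one task per function, or one per basic block).

    deps:     task -> tasks whose value depends on it (re-run on change)
    transfer: (task, state) -> new abstract value; must be monotone
    """
    state = defaultdict(lambda: bottom)
    worklist = list(tasks)
    while worklist:
        t = worklist.pop()
        new = transfer(t, state)
        if new != state[t]:                 # value grew: wake dependents
            state[t] = new
            worklist.extend(deps.get(t, ()))
    return state
```

On a toy instance where each node's abstract value is the set of nodes that can reach it, `solve` converges to the least fixpoint regardless of the order in which tasks are popped, which is the property that makes the task granularity freely configurable.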
Methodology
- Base Solver (TD) – The authors start from Goblint’s top‑down (TD) fixpoint solver, which already supports mixed flow‑sensitivity (i.e., it combines flow‑sensitive information at individual program points with flow‑insensitive global invariants).
- Task Pool Abstraction – A fixed pool of worker threads pulls tasks from a shared queue. The granularity of a task is a configurable parameter (e.g., “analyze one function”, “analyze one basic block”).
- Immediate Parallelism
  - All workers read/write a single concurrent hash map that stores abstract states for program locations.
  - Synchronization is achieved via lock‑free atomic operations; the map guarantees linearizability, so workers always see a consistent view of the global state.
- Independent Parallelism
  - Each worker runs a local copy of the solver state.
  - When a worker discovers a new abstract fact that other workers might need, it publishes the fact to a topic‑based subscription system.
  - Workers subscribe to topics (e.g., “facts about function f”) and pull updates asynchronously.
- Evaluation Setup – The authors compiled a benchmark suite of 12 sizable C projects (total > 30 MLOC). They measured wall‑clock time, CPU utilization, and memory overhead for several configurations: varying core counts (1–32), task granularities (coarse vs. fine), and the two parallelization philosophies.
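The immediate strategy can be sketched with a plain thread pool and one shared state table. This is not the paper's implementation: Goblint uses a lock‑free concurrent hash map, whereas this hypothetical Python sketch substitutes a single lock and the standard `queue.Queue` task‑done protocol for termination. Stale reads outside the lock are harmless here only because the transfer function is monotone, so any missed update triggers a later re‑queue.

```python
import threading
import queue

def parallel_solve(deps, transfer, tasks, n_workers=4):
    """'Immediate' strategy sketch: all workers mutate one shared table."""
    state = {t: frozenset() for t in tasks}
    lock = threading.Lock()
    work = queue.Queue()
    for t in tasks:
        work.put(t)

    def worker():
        while True:
            t = work.get()
            if t is None:                  # shutdown sentinel
                work.task_done()
                return
            new = transfer(t, state)       # may read slightly stale data
            with lock:                     # compare-and-publish atomically
                changed = new != state[t]
                if changed:
                    state[t] = new
            if changed:
                for d in deps.get(t, ()):  # re-queue dependents
                    work.put(d)
            work.task_done()               # only after re-queuing

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    work.join()                            # waits until no task is in flight
    for _ in threads:
        work.put(None)
    for th in threads:
        th.join()
    return state
```

Calling `work.task_done()` only after dependents are re‑queued is what makes `work.join()` a sound termination check: the unfinished‑task count cannot reach zero while any worker still has updates to propagate.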
Results & Findings
| Configuration | Speed‑up (32 cores) | Memory overhead | Observations |
|---|---|---|---|
| Immediate, coarse tasks (per‑thread) | 2.1× | +12 % | Low contention, but limited parallelism for many‑core machines. |
| Immediate, fine tasks (per‑basic‑block) | 4.7× | +18 % | High parallelism, but hash‑map contention spikes for very large programs. |
| Independent, coarse tasks | 2.8× | +22 % | Extra memory for duplicated states, but less contention. |
| Independent, fine tasks | 5.0× | +35 % | Best wall‑clock time; publish/subscribe efficiently spreads updates. |
| Sequential baseline | 1.0× | — | Serves as reference. |
Key takeaways
- Granularity matters: fine‑grained tasks unlock more cores but increase synchronization overhead.
- Independent approach scales better on many‑core machines because it avoids a single shared hash table bottleneck.
- Memory trade‑off: duplicating state per worker is acceptable for typical analysis workloads (the authors report < 2 GB extra for a 30 MLOC code base).
- No loss of precision: both parallelizations produce exactly the same abstract results as the sequential solver.
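The independent approach's avoidance of a shared table can be illustrated with a toy publish/subscribe bus. Everything below is a hypothetical sketch, not the paper's data structure: class and method names (`FactBus`, `Worker`, `pull_updates`) are invented for illustration, and the join operation is assumed to be set union.

```python
import queue
from collections import defaultdict

class FactBus:
    """Toy topic-based pub/sub bus: workers exchange only updated facts."""
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> list of queues

    def subscribe(self, topic):
        q = queue.Queue()
        self.subscribers[topic].append(q)
        return q

    def publish(self, topic, fact):
        for q in self.subscribers[topic]:      # fan out to every subscriber
            q.put(fact)

class Worker:
    """Each worker keeps a private state table, so no locks are needed."""
    def __init__(self, bus, topics):
        self.state = {}
        self.inbox = [bus.subscribe(t) for t in topics]

    def pull_updates(self):
        """Merge facts published by other workers (join = set union)."""
        for q in self.inbox:
            while not q.empty():
                key, value = q.get()
                self.state[key] = self.state.get(key, frozenset()) | value
```

Because each worker merges incoming facts with a join rather than overwriting, updates can arrive in any order and any number of times without affecting the final result, which mirrors why the independent approach preserves precision.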
Practical Implications
- Faster CI pipelines – Projects that already run Goblint (or similar analyzers) as part of continuous integration can cut analysis time from hours to minutes on modern multi‑core build agents.
- Scalable cloud analysis services – Cloud‑based static analysis platforms can dynamically adjust task granularity based on the number of allocated CPUs, achieving near‑linear speed‑up without rewriting the analysis logic.
- Developer tooling – IDE plugins that perform on‑the‑fly analysis (e.g., for detecting data races) can now run more responsive background checks on typical developer laptops equipped with 8‑core CPUs.
- Portability – Because the parallelization is built on a generic task‑pool abstraction, the same engine can be reused across different static analysis frameworks (e.g., Infer, CodeQL) with minimal changes.
Limitations & Future Work
- Memory consumption grows with the independent approach; extremely large code bases may hit RAM limits on modest machines.
- Task granularity selection is currently manual; an adaptive scheduler that automatically tunes granularity at runtime would further simplify usage.
- Publish/subscribe overhead could become a bottleneck for analyses that generate a huge number of tiny facts; exploring more compact diff‑based synchronization is a promising direction.
- Evaluation scope is limited to C programs; applying the same ideas to other languages (Java, Rust) and to analyses that rely on heavy symbolic reasoning remains open.
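One way the adaptive scheduler the authors leave as future work might look is a heuristic that splits tasks finer only while the pool cannot keep every worker busy. This is purely speculative: the function name, granularity labels, and oversubscription threshold below are all invented for illustration.

```python
def pick_granularity(n_coarse_tasks, n_workers, oversubscription=4):
    """Hypothetical runtime heuristic: start coarse to keep per-task
    overhead low, and switch to finer tasks only when there are too few
    coarse tasks to saturate the worker pool."""
    if n_coarse_tasks < n_workers * oversubscription:
        return "basic_block"   # too few tasks: go finer to expose parallelism
    return "function"          # enough tasks: stay coarse, less contention
```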
Bottom line: By decoupling the fixpoint engine from a fixed task size and offering two complementary parallelization strategies, the authors give static analysis practitioners a practical way to harness modern many‑core hardware, turning what used to be a long‑running “single‑gear” process into a flexible, high‑performance engine.
Authors
- Ali Rasim Kocal
- Michael Schwarz
- Simmo Saan
- Helmut Seidl
Paper Information
- arXiv ID: 2602.06680v1
- Categories: cs.PL, cs.DC
- Published: February 6, 2026