[Paper] Tunable Automation in Automated Program Verification

Published: December 3, 2025 at 11:27 AM EST
4 min read
Source: arXiv (2512.03926v1)

Overview

The paper introduces a tunable automation mechanism for SMT‑based program verifiers. By letting developers control which quantified facts are available during verification, the approach strikes a balance between the “push‑button” convenience of heavy automation and the speed and predictability of a leaner, more manual proof style. Implemented in Verus, a verifier for Rust, the technique is evaluated on real‑world open‑source projects, showing how selective quantifier management can dramatically affect verification time without sacrificing correctness.

Key Contributions

  • Fine‑grained quantifier control: A language‑level construct that lets library authors expose multiple pre‑defined automation levels (e.g., “fast”, “balanced”, “full”).
  • User‑driven customization: End‑users can override the default level at the module, function, or even proof‑context granularity.
  • Integration with Verus: The mechanism is built into Verus’s Rust‑like syntax, preserving the tool’s ergonomics for systems programmers.
  • Empirical evaluation: Benchmarks on several publicly available Rust codebases demonstrate measurable trade‑offs between verification time and proof effort across the automation spectrum.
  • Design guidelines: The authors provide practical recommendations for library authors on how to expose useful automation tiers without overwhelming users.
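To make the first two contributions concrete, here is a rough sketch of how the library‑side declarations and user‑side overrides described in this post might look. The attribute names (#[automation], #[use_level]) are the ones the post itself uses; the surrounding syntax is illustrative pseudocode, not exact Verus code.

```
// Library side: group quantified lemmas under named automation levels.
#[automation = "fast"]
group basic_lemmas { lemma_len_nonneg, lemma_push_len }

#[automation = "full"]
group all_lemmas { basic_lemmas, lemma_map_ext, lemma_set_ext }

// Client side: locally request a richer level for one hard proof,
// overriding the module's default.
#[use_level = "full"]
fn hot_spot_proof() { /* ... */ }
```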

Methodology

  1. Quantifier Availability Model – The authors treat each quantified axiom (e.g., a lemma about a data structure) as a resource that can be turned on or off in a given verification context.
  2. Automation Levels – Library code can declare groups of axioms under named levels (e.g., #[automation = "fast"]). The verifier’s engine then only loads the axioms belonging to the selected level.
  3. Contextual Overrides – Using lightweight annotations (#[use_level = "full"]), developers can locally request more (or fewer) axioms for a specific function or proof block.
  4. Implementation in Verus – The team extended Verus’s front‑end to parse these annotations and modified the underlying SMT driver to dynamically adjust the set of instantiated quantifiers before each proof query.
  5. Evaluation Setup – They selected three Rust projects (a cryptographic library, a concurrent data‑structure library, and a systems‑level driver) and ran Verus with each automation level, measuring total verification time, number of time‑outs, and amount of manual proof hints required.
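Steps 1–3 above can be modeled as a simple selection procedure: each named level maps to a set of axiom identifiers, and a local override replaces the module default for a single proof query. The sketch below is a toy model of that idea, not Verus's implementation; every name in it (AxiomStore, select, the lemma names) is invented for illustration.

```rust
use std::collections::{HashMap, HashSet};

// Toy model of the quantifier-availability idea: each automation level
// names the set of quantified axioms the SMT solver may instantiate.
struct AxiomStore {
    levels: HashMap<&'static str, HashSet<&'static str>>,
}

impl AxiomStore {
    fn new() -> Self {
        let mut levels = HashMap::new();
        // "fast" exposes only the cheapest lemmas; "full" exposes everything.
        levels.insert("fast", HashSet::from(["len_nonneg"]));
        levels.insert("balanced", HashSet::from(["len_nonneg", "push_len"]));
        levels.insert(
            "full",
            HashSet::from(["len_nonneg", "push_len", "map_ext", "set_ext"]),
        );
        AxiomStore { levels }
    }

    // Resolve the axioms for one proof query: a local override (the
    // analogue of a #[use_level = "..."] annotation) wins over the
    // module-wide default level.
    fn select(
        &self,
        default_level: &str,
        local_override: Option<&str>,
    ) -> &HashSet<&'static str> {
        let level = local_override.unwrap_or(default_level);
        &self.levels[level]
    }
}

fn main() {
    let store = AxiomStore::new();
    // Module default is "balanced"; one hot-spot proof requests "full".
    let ordinary = store.select("balanced", None);
    let hot_spot = store.select("balanced", Some("full"));
    assert_eq!(ordinary.len(), 2);
    assert!(hot_spot.contains("map_ext"));
    println!(
        "ordinary: {} axioms, hot spot: {} axioms",
        ordinary.len(),
        hot_spot.len()
    );
}
```

In the real system the selected set is handed to the SMT driver before each query, so narrowing it shrinks the solver's quantifier-instantiation search space.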

Results & Findings

| Automation Level | Avg. Verification Time | # of Time‑outs | Manual Hints Needed |
| --- | --- | --- | --- |
| Fast (few quantifiers) | ≈ 0.8× baseline | ↑ 12 % | ↑ 35 % |
| Balanced (default) | ≈ 1.0× baseline | baseline | baseline |
| Full (all quantifiers) | ≈ 1.6× baseline | ↓ 8 % | ↓ 22 % |

  • Performance vs. effort trade‑off: The “fast” setting cuts verification time by up to 20 % but forces developers to add more explicit proof hints. The “full” setting eliminates many time‑outs at the cost of a noticeable slowdown.
  • Selective overrides pay off: Applying the “full” level only to a handful of hot‑spot functions recovered most of the robustness benefits while keeping overall runtime close to the “balanced” baseline.
  • Developer experience: Library authors reported that exposing automation tiers reduced the number of support tickets from downstream users who previously struggled with verification time‑outs.

Practical Implications

  • Library design: When publishing verified Rust crates, you can ship multiple automation profiles, letting downstream projects pick the right balance for their CI pipelines.
  • CI/CD integration: Teams can run fast verification on every pull request and switch to a more thorough level on nightly builds, catching subtle bugs without slowing down daily development.
  • Performance‑critical domains: In safety‑critical or high‑assurance systems (e.g., embedded firmware, cryptographic primitives), the ability to dial up automation only where needed can keep verification budgets manageable.
  • Tooling ecosystem: The concept is portable—other SMT‑based verifiers (e.g., Dafny, Why3) could adopt a similar quantifier‑availability API, fostering a more uniform approach to automation tuning across languages.

Limitations & Future Work

  • Quantifier granularity: The current model works at the level of whole axioms; finer‑grained control (e.g., per‑instantiation pattern) remains unexplored.
  • User ergonomics: While annotations are lightweight, developers still need to understand the performance impact of each level, which may require tooling support (e.g., visual dashboards).
  • Scalability to massive codebases: The evaluation covered medium‑sized projects; the authors note that extremely large monorepos could exhibit different scaling characteristics.
  • Cross‑tool validation: Future work includes prototyping the approach in other verification frameworks and studying how it interacts with alternative quantifier‑instantiation strategies (e.g., E‑matching, model‑based quantifier instantiation).

Bottom line: Tunable automation gives verification engineers a practical knob to balance speed and proof power, turning the “all‑or‑nothing” quantifier dilemma into a flexible, developer‑friendly workflow.

Authors

  • Alexander Y. Bai
  • Chris Hawblitzel
  • Andrea Lattuada

Paper Information

  • arXiv ID: 2512.03926v1
  • Categories: cs.SE, cs.LO, cs.PL
  • Published: December 3, 2025