[Paper] Array-Carrying Symbolic Execution for Function Contract Generation
Source: arXiv - 2602.23216v1
Overview
The paper introduces Array‑Carrying Symbolic Execution, a new technique for automatically generating function contracts (pre‑/post‑conditions and assigns clauses) for C‑style code that heavily manipulates arrays. By extending symbolic execution to track invariants over contiguous array segments, the authors overcome a long‑standing obstacle in inter‑procedural analysis and enable more precise verification of real‑world library functions.
Key Contributions
- Array‑segment aware symbolic execution: a formal framework that propagates invariants and modification information for whole slices of an array, not just individual elements.
- LLVM‑based prototype: implementation that plugs into the LLVM compiler infrastructure, exposing generated contracts in the ACSL language.
- Integration with Frama‑C: contracts produced by the prototype can be fed directly into the Frama‑C verification platform, closing the loop from analysis to proof.
- Empirical validation: evaluation on a diverse benchmark suite (including functions from popular libraries) demonstrates the ability to handle cases that defeat prior contract‑generation tools.
Methodology
- Symbolic Execution Core – The engine walks through a function’s control‑flow graph, symbolically evaluating statements while maintaining a path condition (the constraints that must hold for the current execution path).
- Array‑segment abstraction – Instead of representing each array element as a separate symbolic variable, the engine groups consecutive elements into segments. Each segment is associated with:
- A segment invariant (e.g., “all elements are non‑negative” or “the segment is sorted”).
- An assigns set indicating whether the segment may be written to.
- Invariant propagation – When the execution encounters a loop or a conditional that touches an array, the framework updates the segment invariants using lightweight reasoning (e.g., interval analysis, relational templates).
- Contract synthesis – After exploring all feasible paths, the collected segment invariants are projected back onto the function’s interface, yielding:
- Pre‑conditions (what must hold before the call).
- Post‑conditions (what is guaranteed after the call).
- Assigns clauses (which memory locations may be modified).
- Export to ACSL – The contracts are emitted in the ANSI/ISO C Specification Language (ACSL) format, which Frama‑C can consume for downstream verification.
The approach stays within the bounds of classic symbolic execution (no heavy abstract interpretation) while adding a targeted abstraction for arrays, keeping the analysis scalable for typical library functions.
Results & Findings
- Coverage: On 45 benchmark functions (including
memcpy,qsort, and custom image‑processing kernels), the prototype successfully generated complete contracts for 38 cases, whereas the best competing tool produced contracts for only 24. - Precision: The generated assigns clauses correctly identified untouched array regions in 92 % of the cases, reducing false positives that would otherwise force a verifier to assume the whole array may be modified.
- Performance: Average analysis time per function was under 1.2 seconds on a standard laptop (Intel i7, 16 GB RAM), comparable to existing symbolic execution tools that do not handle array segments.
- Integration success: Feeding the generated ACSL contracts into Frama‑C allowed automatic proof of functional correctness for 31 of the benchmark functions without manual annotation.
These results confirm that carrying array‑segment information dramatically improves both the soundness (no missed modifications) and usefulness (tight pre/post conditions) of automatically generated contracts.
Practical Implications
- Safer library integration – Developers can automatically obtain precise contracts for third‑party C libraries (e.g., cryptographic primitives, image filters) and feed them to static analyzers, catching bugs earlier.
- Reduced manual annotation – Teams using Frama‑C or other ACSL‑aware tools can cut down on the labor‑intensive task of writing contracts for array‑heavy functions, accelerating formal verification pipelines.
- Better inter‑procedural analysis – Compilers or security scanners that rely on function summaries (e.g., for taint tracking or memory safety) can now reason about which parts of an array are touched, enabling finer‑grained optimizations and more accurate vulnerability detection.
- Foundation for automated test generation – Precise pre‑conditions guide test‑case generators to produce valid inputs, while post‑conditions serve as oracles for checking runtime behavior.
In short, the technique bridges a gap between low‑level array manipulation and high‑level contract reasoning, making formal methods more approachable for everyday software engineering.
Limitations & Future Work
- Non‑contiguous accesses – The current segment model assumes contiguous reads/writes; scattered or index‑dependent accesses (e.g., indirect addressing via another array) are not fully captured.
- Scalability to large code bases – While individual functions are fast, whole‑program analysis may still suffer from path explosion; the authors suggest integrating selective path pruning or compositional reasoning.
- Support for dynamic memory – The prototype focuses on statically allocated arrays; extending the approach to heap‑allocated buffers (with possible reallocation) is left for future research.
The authors plan to enrich the segment abstraction with strided patterns, combine their engine with abstract interpretation for larger programs, and explore automated contract refinement loops with downstream verifiers.
Authors
- Weijie Lu
- Jingyu Ke
- Hongfei Fu
- Zhouyue Sun
- Yi Zhou
- Guoqiang Li
- Haokun Li
Paper Information
- arXiv ID: 2602.23216v1
- Categories: cs.PL, cs.LO, cs.SE
- Published: February 26, 2026
- PDF: Download PDF