[Paper] From Separate Compilation to Sound Language Composition

Published: February 3, 2026 at 12:38 PM EST
4 min read
Source: arXiv


Overview

The paper From Separate Compilation to Sound Language Composition tackles a long‑standing pain point for language engineers: how to keep language extensions modular and safely compile them separately. By introducing nlgcheck, a static analysis plug‑in for the Neverlang language workbench, the authors show that you can catch attribute‑grammar errors at compile time without giving up the flexibility that modern language workbenches promise.

Key Contributions

  • nlgcheck tool – a data‑flow‑based static analyzer that verifies attribute accesses across independently compiled language modules.
  • Formal soundness proof – guarantees that any attribute‑related error that would manifest at runtime is caught by nlgcheck at compile time, eliminating false negatives.
  • Integration with Neverlang – demonstrates that the analysis can be added to an existing workbench without redesigning its core architecture.
  • Empirical validation – mutation‑testing on real Neverlang projects shows >95 % detection of injected attribute‑grammar bugs with negligible compile‑time overhead.
  • Preservation of separate compilation – developers can continue to ship language extensions as independent artifacts (e.g., Maven/Gradle dependencies) while still enjoying strong static guarantees.
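To make the failure mode concrete, here is a minimal sketch (all names are hypothetical illustrations, not Neverlang identifiers or the paper's code): a module that reads an attribute compiles fine in isolation, and the missing definition only surfaces when the composed language is actually run.

```python
# Toy attribute map: the kind of late runtime failure nlgcheck aims to
# surface at compile time. "env" and "type_of" are hypothetical names.

def type_of(attrs):
    """An 'extension' that reads the inherited attribute 'env'."""
    return attrs["env"]  # compiles fine on its own...

# ...but this particular composition never defines 'env':
composition = {"ast": "1 + 2"}

try:
    type_of(composition)
except KeyError as missing:
    # The error only appears when this composition is evaluated.
    print(f"runtime attribute error: missing {missing}")
```

With a static def‑use check across module boundaries, this mismatch would instead be reported when the extension is compiled against the composition.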

Methodology

  1. Problem modeling – The authors formalize the language extension problem as a composition of attribute grammars, where each extension may introduce new synthesized/inherited attributes.
  2. Data‑flow analysis design – They construct a forward‑flow analysis that tracks the definition‑use chains of attributes across module boundaries, treating each Neverlang component as a separate control‑flow graph.
  3. Implementation in Neverlang – nlgcheck plugs into Neverlang’s compilation pipeline, intercepting the generated AST and attribute maps, then running the analysis before code generation.
  4. Mutation testing – To evaluate effectiveness, they automatically inject typical attribute‑grammar bugs (e.g., missing definitions, type mismatches) into several open‑source Neverlang projects and measure detection rates.
  5. Performance measurement – Compilation times with and without nlgcheck are compared across a range of project sizes to assess practical impact.
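The forward analysis in step 2 can be sketched in a few lines. This is a simplified model under stated assumptions (components flattened to ordered read/write sets, no distinction between synthesized and inherited attributes, invented names), not the paper's implementation:

```python
# Hedged sketch of a forward def-use analysis over composed modules:
# walk components in evaluation order, carry the set of attributes
# defined so far, and flag any read of a not-yet-defined attribute.

def check_composition(components):
    """components: list of (name, reads, writes) in evaluation order.
    Returns (component, attribute) pairs that violate def-use order."""
    defined = set()
    errors = []
    for name, reads, writes in components:
        for attr in reads:
            if attr not in defined:   # use with no prior definition
                errors.append((name, attr))
        defined |= set(writes)        # definitions flow forward
    return errors

pipeline = [
    ("base",  [],             ["ast"]),
    ("types", ["ast"],        ["type"]),
    ("eval",  ["ast", "env"], ["value"]),  # "env" is never written: flagged
]
print(check_composition(pipeline))  # [('eval', 'env')]
```

The real analysis works per component over control‑flow graphs and must remain valid when modules are compiled separately, but the core invariant is the same: every attribute use must be reachable from a definition.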

Results & Findings

| Metric | Without nlgcheck | With nlgcheck |
| --- | --- | --- |
| Compilation time | baseline | +3 % on average (max 7 % on the largest benchmark) |
| Bug detection (mutations) | 0 % (runtime failures only) | 96 % (remaining 4 % were benign or unreachable) |
| False positives | N/A | <1 % (mostly due to overly aggressive dead‑code heuristics) |
| Developer effort | manual debugging of runtime attribute errors | early compile‑time feedback, reducing debugging cycles by ~40 % (qualitative survey) |

The data show that nlgcheck catches almost all attribute‑related errors before code is generated, while adding only a tiny compile‑time penalty—well within the tolerance of typical CI pipelines.

Practical Implications

  • Safer language extensions – Teams can publish language plugins (e.g., DSLs, custom syntax) as separate Maven/Gradle artifacts without fearing hidden attribute bugs that only surface at runtime.
  • Improved CI/CD – nlgcheck can be run as a lightweight step in continuous integration, turning what used to be a flaky runtime test into a deterministic compile‑time check.
  • Easier ecosystem integration – Because the analysis respects separate compilation, existing dependency managers (npm, pip, Cargo) can treat language extensions like any other library, simplifying versioning and reproducibility.
  • Reduced maintenance cost – Early detection of attribute mismatches cuts down on “hard‑to‑reproduce” bugs that often arise when multiple teams extend a base language independently.
  • Potential for other workbenches – The data‑flow approach is language‑agnostic; similar static checks could be added to other modular language workbenches (e.g., Spoofax, Rascal) to bring the same guarantees.

Limitations & Future Work

  • Scope limited to attribute grammars – nlgcheck currently focuses on attribute definition/use; other semantic checks (e.g., type‑system extensions, name‑resolution policies) remain out of scope.
  • Assumes well‑formed Neverlang modules – The analysis presumes that each module’s internal grammar is already sound; malformed modules can still cause the analyzer to abort.
  • Scalability to massive ecosystems – While the overhead is modest for the evaluated projects, the authors note that very large, highly interdependent language ecosystems may need incremental analysis or caching strategies.
  • Future directions include extending the framework to handle type‑level extensions, integrating incremental compilation to further shrink analysis time, and exploring cross‑workbench interoperability so that the same static guarantees can be offered across different language engineering platforms.

Authors

  • Federico Bruzzone
  • Walter Cazzola
  • Luca Favalli

Paper Information

  • arXiv ID: 2602.03777v1
  • Categories: cs.PL, cs.SE
  • Published: February 3, 2026