[Paper] From Separate Compilation to Sound Language Composition

Published: February 3, 2026 at 12:38 PM EST
4 min read
Source: arXiv


Overview

The paper From Separate Compilation to Sound Language Composition tackles a long‑standing pain point for language engineers: how to keep language extensions modular and safely compile them separately. By introducing nlgcheck, a static analysis plug‑in for the Neverlang language workbench, the authors show that you can catch attribute‑grammar errors at compile time without giving up the flexibility that modern language workbenches promise.

Key Contributions

  • nlgcheck tool – a data‑flow‑based static analyzer that verifies attribute accesses across independently compiled language modules.
  • Formal soundness proof – guarantees that any attribute‑related error that would manifest at runtime is caught by nlgcheck at compile time, eliminating false negatives.
  • Integration with Neverlang – demonstrates that the analysis can be added to an existing workbench without redesigning its core architecture.
  • Empirical validation – mutation‑testing on real Neverlang projects shows >95 % detection of injected attribute‑grammar bugs with negligible compile‑time overhead.
  • Preservation of separate compilation – developers can continue to ship language extensions as independent artifacts (e.g., Maven/Gradle dependencies) while still enjoying strong static guarantees.
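To make the failure mode concrete, here is a minimal sketch (all names are hypothetical illustrations, not Neverlang identifiers or the paper's code): a module that reads an attribute compiles fine in isolation, and the missing definition only surfaces when the composed language is actually run.

```python
# Toy attribute map: the kind of late runtime failure nlgcheck aims to
# surface at compile time. "env" and "type_of" are hypothetical names.

def type_of(attrs):
    """An 'extension' that reads the inherited attribute 'env'."""
    return attrs["env"]  # compiles fine on its own...

# ...but this particular composition never defines 'env':
composition = {"ast": "1 + 2"}

try:
    type_of(composition)
except KeyError as missing:
    # The error only appears when this composition is evaluated.
    print(f"runtime attribute error: missing {missing}")
```

With a static def‑use check across module boundaries, this mismatch would instead be reported when the extension is compiled against the composition.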

Methodology

  1. Problem modeling – The authors formalize the language extension problem as a composition of attribute grammars, where each extension may introduce new synthesized/inherited attributes.
  2. Data‑flow analysis design – They construct a forward‑flow analysis that tracks the definition‑use chains of attributes across module boundaries, treating each Neverlang component as a separate control‑flow graph.
  3. Implementation in Neverlang – nlgcheck plugs into Neverlang’s compilation pipeline, intercepting the generated AST and attribute maps, then running the analysis before code generation.
  4. Mutation testing – To evaluate effectiveness, they automatically inject typical attribute‑grammar bugs (e.g., missing definitions, type mismatches) into several open‑source Neverlang projects and measure detection rates.
  5. Performance measurement – Compilation times with and without nlgcheck are compared across a range of project sizes to assess practical impact.
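The forward analysis in step 2 can be sketched in a few lines. This is a simplified model under stated assumptions (components flattened to ordered read/write sets, no distinction between synthesized and inherited attributes, invented names), not the paper's implementation:

```python
# Hedged sketch of a forward def-use analysis over composed modules:
# walk components in evaluation order, carry the set of attributes
# defined so far, and flag any read of a not-yet-defined attribute.

def check_composition(components):
    """components: list of (name, reads, writes) in evaluation order.
    Returns (component, attribute) pairs that violate def-use order."""
    defined = set()
    errors = []
    for name, reads, writes in components:
        for attr in reads:
            if attr not in defined:   # use with no prior definition
                errors.append((name, attr))
        defined |= set(writes)        # definitions flow forward
    return errors

pipeline = [
    ("base",  [],             ["ast"]),
    ("types", ["ast"],        ["type"]),
    ("eval",  ["ast", "env"], ["value"]),  # "env" is never written: flagged
]
print(check_composition(pipeline))  # [('eval', 'env')]
```

The real analysis works per component over control‑flow graphs and must remain valid when modules are compiled separately, but the core invariant is the same: every attribute use must be reachable from a definition.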

Results & Findings

| Metric | Without nlgcheck | With nlgcheck |
| --- | --- | --- |
| Compilation time | baseline | +3 % on average (max 7 % on the largest benchmark) |
| Bug detection (mutations) | 0 % (runtime failures only) | 96 % (remaining 4 % were benign or unreachable) |
| False positives | N/A | <1 % (mostly due to overly aggressive dead‑code heuristics) |
| Developer effort | manual debugging of runtime attribute errors | early compile‑time feedback, reducing debugging cycles by ~40 % (qualitative survey) |

The data show that nlgcheck catches almost all attribute‑related errors before code is generated, while adding only a tiny compile‑time penalty—well within the tolerance of typical CI pipelines.

Practical Implications

  • Safer language extensions – Teams can publish language plugins (e.g., DSLs, custom syntax) as separate Maven/Gradle artifacts without fearing hidden attribute bugs that only surface at runtime.
  • Improved CI/CD – nlgcheck can be run as a lightweight step in continuous integration, turning what used to be a flaky runtime test into a deterministic compile‑time check.
  • Easier ecosystem integration – Because the analysis respects separate compilation, existing dependency managers (npm, pip, Cargo) can treat language extensions like any other library, simplifying versioning and reproducibility.
  • Reduced maintenance cost – Early detection of attribute mismatches cuts down on “hard‑to‑reproduce” bugs that often arise when multiple teams extend a base language independently.
  • Potential for other workbenches – The data‑flow approach is language‑agnostic; similar static checks could be added to other modular language workbenches (e.g., Spoofax, Rascal) to bring the same guarantees.

Limitations & Future Work

  • Scope limited to attribute grammars – nlgcheck currently focuses on attribute definition/use; other semantic checks (e.g., type‑system extensions, name‑resolution policies) remain out of scope.
  • Assumes well‑formed Neverlang modules – The analysis presumes that each module’s internal grammar is already sound; malformed modules can still cause the analyzer to abort.
  • Scalability to massive ecosystems – While the overhead is modest for the evaluated projects, the authors note that very large, highly interdependent language ecosystems may need incremental analysis or caching strategies.
  • Future directions include extending the framework to handle type‑level extensions, integrating incremental compilation to further shrink analysis time, and exploring cross‑workbench interoperability so that the same static guarantees can be offered across different language engineering platforms.

Authors

  • Federico Bruzzone
  • Walter Cazzola
  • Luca Favalli

Paper Information

  • arXiv ID: 2602.03777v1
  • Categories: cs.PL, cs.SE
  • Published: February 3, 2026