[Paper] An Empirical Comparison of General Context-Free Parsers

Published: (June 7, 2026 at 01:58 AM EDT)
2 min read
Source: arXiv

Source: arXiv - 2606.08465v1

Overview

Parsing underpins a vast range of software engineering tasks, from compilers and static analyzers to language servers and fuzz testing tools. Yet most parsers deployed in practice are deterministic (LL or LR), forcing developers not only to contort their grammars to fit the parser, but to simplify the very languages they design sacrificing expressiveness for the sake of parseability. General context-free parsers eliminate this constraint. Yet, despite decades of algorithmic development, no rigorous head-to-head comparison exists across the major families of parsing algorithms. We present the first unified, controlled benchmark of six generalized parsing algorithms: CYK, Valiant, Earley, GLL, RNGLR, and BRNGLR, plus deterministic LL(1) and LR(1) baselines, all implemented in Rust with shared data structures and parse-tree extraction, and evaluated across 22 grammars ranging from simple expressions to full C++ and Java. Our results show that the cost of generality is lower than widely assumed. On deterministic grammars, the GLR family incurs only a 3x median slowdown over LR(1), with a narrow and predictable variance. GLR is the clear performance winner among generalized parsers and a practical default choice for software engineering tools.

Key Contributions

This paper presents research in the following areas:

  • cs.FL
  • cs.PF
  • cs.PL
  • cs.SE

Methodology

Please refer to the full paper for detailed methodology.

Practical Implications

This research contributes to the advancement of cs.FL.

Authors

  • Huan Vo
  • Danushka Liyanage
  • Hong Jin Kang
  • Sasha Rubin
  • Rahul Gopinath

Paper Information

  • arXiv ID: 2606.08465v1
  • Categories: cs.FL, cs.PF, cs.PL, cs.SE
  • Published: June 7, 2026
  • PDF: Download PDF
0 views
Back to Blog

Related posts

Read more »

[Paper] Making Software Meaningful

Adopting a single measure can improve the usability, modularity and accountability of software: a commitment to explicit meaning. This entails constructing and ...