[Paper] Package-Aware Approach for Repository-Level Code Completion in Pharo

Published: January 9, 2026 at 03:19 AM EST
Source: arXiv - 2601.05617v1

Overview

The paper introduces a package‑aware heuristic for Pharo’s repository‑level code completion engine. By taking the package structure of a project into account, the new approach surfaces more relevant symbols (classes, globals, variables) during typing, improving the developer experience over the existing flat‑global completion strategy.

Key Contributions

  • Package‑first search strategy: Prioritizes symbols from the same package, then other packages in the repository, and finally the global namespace.
  • Integration with Pharo’s lazy completion architecture: Extends the existing semantic heuristics without disrupting the modular fetcher design.
  • Empirical evaluation: Reports a Mean Reciprocal Rank (MRR) of 0.57, compared with 0.42 for the default heuristic and 0.38 for a naïve global‑only baseline.
  • Open‑source prototype: Implemented as a fetcher that plugs into Pharo’s existing completion engine, enabling immediate experimentation by the community.

Methodology

  1. Heuristic Design – The authors defined a three‑tiered lookup order:

    1. Local package: Symbols defined in the same package as the class where completion is invoked.
    2. Repository‑wide packages: Symbols from any other package within the same repository.
    3. Global namespace: All remaining globals (e.g., system libraries).
  2. Implementation – The heuristic was added as a new “fetcher” that plugs into Pharo’s existing lazy completion pipeline. It re‑uses the same ranking infrastructure, only altering the order in which candidate symbols are gathered.

  3. Evaluation Protocol

    • Dataset: A collection of real‑world Pharo projects (e.g., Seaside, Moose) covering a variety of package structures.
    • Metrics: Mean Reciprocal Rank (MRR) to capture how early the correct suggestion appears in the list.
    • Baselines: (a) the default semantic heuristic (flat global view) and (b) a simple global‑namespace query.
  4. Statistical Analysis – Paired t‑tests were used to verify that observed MRR improvements are statistically significant.
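
The three‑tiered lookup order from step 1 can be sketched as a lazy generator. This is a minimal illustration in Python rather than Pharo, and it is not the authors' implementation: the `Symbol` type, the function name, and the package names in the sample data are all hypothetical. The only thing taken from the description above is the order in which candidates are gathered.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterator


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for a class or global known to the image."""
    name: str
    package: str  # package that defines the symbol


def package_aware_candidates(
    prefix: str,
    current_package: str,
    repository_packages: set[str],
    all_symbols: list[Symbol],
) -> Iterator[Symbol]:
    """Yield symbols matching `prefix`, tier by tier.

    Later tiers are only scanned if the consumer keeps asking for more.
    """

    def tier(symbol: Symbol) -> int:
        if symbol.package == current_package:
            return 0  # tier 1: same package as the edited class
        if symbol.package in repository_packages:
            return 1  # tier 2: other packages in the same repository
        return 2      # tier 3: everything else in the global namespace

    for wanted in (0, 1, 2):
        for symbol in all_symbols:
            if symbol.name.startswith(prefix) and tier(symbol) == wanted:
                yield symbol


# Sample data, invented for illustration.
symbols = [
    Symbol("OrderedCollection", "Collections-Sequenceable"),
    Symbol("OrderValidator", "Shop-Core"),
    Symbol("OrderView", "Shop-UI"),
    Symbol("Order", "Shop-Core"),
]
repo = {"Shop-Core", "Shop-UI"}

names = [s.name for s in package_aware_candidates("Ord", "Shop-Core", repo, symbols)]
# names == ["OrderValidator", "Order", "OrderView", "OrderedCollection"]

# Taking only the first two suggestions never reaches tiers 2 and 3.
top_two = [
    s.name
    for s in islice(package_aware_candidates("Ord", "Shop-Core", repo, symbols), 2)
]
```

Within a tier the sketch keeps the incoming order of `all_symbols`; a real engine would apply its own ranking there.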

Results & Findings

Heuristic                   MRR (higher is better)
Default semantic (flat)     0.42
Global‑only query           0.38
Package‑aware (proposed)    0.57

  • The package‑aware approach raises MRR from 0.42 to 0.57, a relative gain of about 36 % over the default heuristic and 50 % over the global‑only query (0.38).
  • Improvements are most pronounced in large repositories where packages encapsulate cohesive functionality.
  • Because the correct class or variable appears earlier in the list, developers should need fewer keystrokes and context switches to reach it.
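
For readers unfamiliar with the metric, the sketch below shows how MRR is computed and how the relative gains follow from the reported values. The toy completion cases are invented for illustration and are not data from the paper; only the three MRR values (0.42, 0.38, 0.57) come from the results above.

```python
def reciprocal_rank(suggestions: list[str], expected: str) -> float:
    """Return 1/rank of `expected` (1-based), or 0.0 if it is absent."""
    for rank, name in enumerate(suggestions, start=1):
        if name == expected:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(cases: list[tuple[list[str], str]]) -> float:
    """Average the reciprocal rank over (suggestions, expected) pairs."""
    if not cases:
        raise ValueError("need at least one completion case")
    return sum(reciprocal_rank(s, e) for s, e in cases) / len(cases)


def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, as a fraction."""
    return (new - old) / old


# Toy cases, invented for illustration only.
shown = ["Order", "OrderView", "OrderedCollection"]
cases = [
    (shown, "Order"),              # rank 1 -> 1.0
    (shown, "OrderView"),          # rank 2 -> 0.5
    (shown, "OrderedDictionary"),  # not suggested -> 0.0
]
toy_mrr = mean_reciprocal_rank(cases)  # (1.0 + 0.5 + 0.0) / 3

# Relative gains derived from the reported MRR values.
gain_vs_default = relative_gain(0.57, 0.42)  # roughly 0.36
gain_vs_global = relative_gain(0.57, 0.38)   # roughly 0.50
```

An MRR of 0.57 means the correct symbol sits, on average, close to the second position; 0.42 corresponds to somewhere between the second and third.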

Practical Implications

  • Faster Development Cycles – By surfacing the most likely symbols first, developers spend less time hunting for the right name, especially in codebases with many packages.
  • Better Live‑Programming Experience – Pharo’s live‑coding environment benefits from more accurate completions, leading to fewer runtime errors caused by mistyped identifiers.
  • Scalable Tooling – The heuristic can be adopted by other Smalltalk‑derived IDEs or even ported to languages that expose a package/module hierarchy (e.g., Python, JavaScript).
  • Customizable Completion – Teams can tune the package priority (e.g., give higher weight to “core” packages) without rewriting the whole completion engine.
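
One hypothetical way to tune package priority is a per‑package weight table applied on top of the gathered candidates. The summary does not describe how such tuning would be exposed, so the function name, the weight values, and the package names below are assumptions made purely for illustration.

```python
def rank_by_package_weight(
    candidates: list[tuple[str, str]],
    weights: dict[str, float],
    default_weight: float = 0.0,
) -> list[str]:
    """Order (name, package) pairs by descending package weight.

    sorted() is stable, so candidates from equally weighted packages
    keep their incoming order (for example, the tier order).
    """
    ranked = sorted(
        candidates,
        key=lambda pair: -weights.get(pair[1], default_weight),
    )
    return [name for name, _ in ranked]


# Sample data, invented for illustration.
candidates = [
    ("OrderView", "Shop-UI"),
    ("Order", "Shop-Core"),
    ("OrderedCollection", "Collections-Sequenceable"),
    ("OrderValidator", "Shop-Core"),
]
weights = {"Shop-Core": 2.0, "Shop-UI": 1.0}  # "core" package favoured

ranked_names = rank_by_package_weight(candidates, weights)
# ranked_names == ["Order", "OrderValidator", "OrderView", "OrderedCollection"]
```

Keeping the weights in a plain mapping means a team could change priorities through configuration alone, leaving the candidate‑gathering code untouched.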

Limitations & Future Work

  • Repository Size Sensitivity – In very small projects the three‑tiered lookup may offer little benefit, and the extra lookup step could add slight latency.
  • Cross‑Repository Dependencies – The current design only considers packages within a single repository; handling external dependencies (e.g., via Metacello) remains an open challenge.
  • User‑Study Validation – The evaluation relies solely on offline metrics (MRR). A follow‑up user study is needed to determine whether these gains translate into perceived productivity improvements in real coding sessions.
  • Extending to Contextual Ranking – Future work could combine package awareness with additional signals (e.g., recent edits, call‑graph proximity) for even smarter suggestions.

Authors

  • Omar Abedelkader
  • Stéphane Ducasse
  • Oleksandr Zaitsev
  • Romain Robbes
  • Guillermo Polito

Paper Information

  • arXiv ID: 2601.05617v1
  • Categories: cs.SE
  • Published: January 9, 2026