[Paper] Package-Aware Approach for Repository-Level Code Completion in Pharo
Source: arXiv - 2601.05617v1
Overview
The paper introduces a package‑aware heuristic for Pharo’s repository‑level code completion engine. By taking the package structure of a project into account, the new approach surfaces more relevant symbols (classes, globals, variables) during typing, improving the developer experience over the existing flat‑global completion strategy.
Key Contributions
- Package‑first search strategy: Prioritizes symbols from the same package, then other packages in the repository, and finally the global namespace.
- Integration with Pharo’s lazy completion architecture: Extends the existing semantic heuristics without disrupting the modular fetcher design.
- Empirical evaluation: Shows a measurable boost in Mean Reciprocal Rank (MRR) compared to the default heuristic and a naïve global‑only baseline.
- Open‑source prototype: Implemented as a drop‑in replacement for Pharo’s completion engine, enabling immediate experimentation by the community.
Methodology
- Heuristic Design – The authors defined a three‑tiered lookup order:
  - Local package: Symbols defined in the same package as the class where completion is invoked.
  - Repository‑wide packages: Symbols from any other package within the same repository.
  - Global namespace: All remaining globals (e.g., system libraries).
- Implementation – The heuristic was added as a new “fetcher” that plugs into Pharo’s existing lazy completion pipeline. It re‑uses the same ranking infrastructure, only altering the order in which candidate symbols are gathered.
- Evaluation Protocol –
  - Dataset: A collection of real‑world Pharo projects (e.g., Seaside, Moose) covering a variety of package structures.
  - Metrics: Mean Reciprocal Rank (MRR) to capture how early the correct suggestion appears in the list.
  - Baselines: (a) the default semantic heuristic (flat global view) and (b) a simple global‑namespace query.
- Statistical Analysis – Paired t‑tests were used to verify that observed MRR improvements are statistically significant.
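The three‑tiered lookup and its lazy, fetcher‑style evaluation can be sketched in a few lines. The block below is an illustrative Python sketch, not Pharo's actual fetcher API: the `Symbol` record, `tier_of`, and `package_aware_candidates` are hypothetical names, and the sample packages are made up. It assumes every symbol knows its defining package and repository, and that globals outside any repository carry `None` as their repository.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterator, Optional, Sequence


@dataclass(frozen=True)
class Symbol:
    name: str
    package: str               # package that defines the symbol
    repository: Optional[str]  # None marks a global outside any repository


def tier_of(symbol: Symbol, current_package: str, current_repository: str) -> int:
    """Return 0 for the local package, 1 for the same repository, 2 for globals."""
    if symbol.package == current_package:
        return 0
    if symbol.repository == current_repository:
        return 1
    return 2


def package_aware_candidates(
    prefix: str,
    symbols: Sequence[Symbol],
    current_package: str,
    current_repository: str,
) -> Iterator[Symbol]:
    """Lazily yield symbols matching the prefix, one tier at a time.

    Because this is a generator, later tiers are only scanned if the
    consumer asks for more results than the earlier tiers provide.
    """
    for wanted_tier in (0, 1, 2):
        for symbol in symbols:
            if (symbol.name.startswith(prefix)
                    and tier_of(symbol, current_package, current_repository) == wanted_tier):
                yield symbol


# Hypothetical sample data, for illustration only.
SYMBOLS = [
    Symbol("OrderedCollection", "Collections-Sequenceable", None),
    Symbol("OrderLine", "Shop-Billing", "shop"),
    Symbol("Order", "Shop-Model", "shop"),
    Symbol("OrderValidator", "Shop-Model", "shop"),
]

# Ask only for the first two suggestions; the global tier is never visited.
top_two = list(islice(package_aware_candidates("Ord", SYMBOLS, "Shop-Model", "shop"), 2))
```

Only the order in which candidates are gathered changes here; any ranking applied afterwards is left untouched, which mirrors the description of the implementation above.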
Results & Findings
| Heuristic | MRR (higher is better) |
|---|---|
| Default semantic (flat) | 0.42 |
| Global‑only query | 0.38 |
| Package‑aware (proposed) | 0.57 |
- The package‑aware approach raises MRR from 0.42 to 0.57, a relative gain of roughly 36 % over the default heuristic.
- Improvements are most pronounced in large repositories where packages encapsulate cohesive functionality.
- Developers receive the right class or variable earlier, reducing the number of keystrokes and context switches.
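For readers unfamiliar with the metric, the following is a minimal Python sketch of how MRR is computed and how a relative gain between two MRR values is derived. The function names are illustrative, and the only figures used are the MRR values from the table; the paper's per-completion data is not reproduced here.

```python
from typing import Iterable, Sequence, Tuple


def reciprocal_rank(suggestions: Sequence[str], expected: str) -> float:
    """Return 1/rank of the expected symbol, or 0.0 if it was not suggested."""
    for rank, suggestion in enumerate(suggestions, start=1):
        if suggestion == expected:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(cases: Iterable[Tuple[Sequence[str], str]]) -> float:
    """Average the reciprocal rank over (suggestion list, expected symbol) pairs."""
    cases = list(cases)
    if not cases:
        raise ValueError("MRR is undefined for an empty set of cases")
    return sum(reciprocal_rank(s, e) for s, e in cases) / len(cases)


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Relative gain of the package-aware heuristic over the default,
# using the MRR values reported in the table (0.57 vs 0.42).
gain_over_default = relative_gain(0.57, 0.42)  # about 0.357
```

An MRR of 0.5 means the correct symbol sits, on average in the reciprocal sense, at the second position; a missing suggestion contributes 0 to the mean.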
Practical Implications
- Faster Development Cycles – By surfacing the most likely symbols first, developers spend less time hunting for the right name, especially in codebases with many packages.
- Better Live‑Programming Experience – Pharo’s live‑coding environment benefits from more accurate completions, leading to fewer runtime errors caused by mistyped identifiers.
- Scalable Tooling – The heuristic can be adopted by other Smalltalk‑derived IDEs or even ported to languages that expose a package/module hierarchy (e.g., Python, JavaScript).
- Customizable Completion – Teams can tune the package priority (e.g., give higher weight to “core” packages) without rewriting the whole completion engine.
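One way such tuning could look is sketched below in Python. This is a hypothetical illustration, not a configuration mechanism described in the paper or provided by Pharo: `rank_candidates` and the `package_boost` mapping are assumed names, and the packages are made up. The idea is to keep the three tiers fixed and let a per‑package weight reorder candidates within a tier.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Candidate = Tuple[str, str]  # (symbol name, defining package)


def rank_candidates(
    candidates: Iterable[Candidate],
    current_package: str,
    repository_packages: Set[str],
    package_boost: Optional[Dict[str, int]] = None,
) -> List[Candidate]:
    """Sort candidates by tier, then by descending package boost, then by name."""
    boost = package_boost or {}

    def sort_key(candidate: Candidate):
        name, package = candidate
        if package == current_package:
            tier = 0  # local package
        elif package in repository_packages:
            tier = 1  # other packages in the same repository
        else:
            tier = 2  # global namespace
        # Negate the boost so that higher weights sort first within a tier.
        return (tier, -boost.get(package, 0), name)

    return sorted(candidates, key=sort_key)


# Hypothetical sample data, for illustration only.
CANDIDATES = [
    ("LogRecord", "App-Utils"),
    ("LoginForm", "App-UI"),
    ("Logger", "App-Core"),
    ("LogStream", "Kernel"),
]
REPOSITORY = {"App-UI", "App-Utils", "App-Core"}

default_order = rank_candidates(CANDIDATES, "App-UI", REPOSITORY)
core_first = rank_candidates(CANDIDATES, "App-UI", REPOSITORY, {"App-Core": 10})
```

Unlike a lazy fetcher, this sketch sorts the whole candidate list eagerly; it is meant only to show that a priority tweak can live in the sort key rather than in the completion engine itself.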
Limitations & Future Work
- Repository Size Sensitivity – In very small projects the three‑tiered lookup may offer little benefit, and the extra lookup step could add slight latency.
- Cross‑Repository Dependencies – The current design only considers packages within a single repository; handling external dependencies (e.g., via Metacello) remains an open challenge.
- User‑Study Validation – The evaluation relies on offline metrics (MRR). A follow‑up user study would be needed to confirm whether these gains translate into perceived productivity in real coding sessions.
- Extending to Contextual Ranking – Future work could combine package awareness with additional signals (e.g., recent edits, call‑graph proximity) for even smarter suggestions.
Authors
- Omar Abedelkader
- Stéphane Ducasse
- Oleksandr Zaitsev
- Romain Robbes
- Guillermo Polito
Paper Information
- arXiv ID: 2601.05617v1
- Categories: cs.SE
- Published: January 9, 2026