[Paper] A Study of Library Usage in Agent-Authored Pull Requests

Published: December 12, 2025 at 09:21 AM EST
3 min read
Source: arXiv - 2512.11589v1

Overview

Lukas Twist’s study dives into how AI‑driven coding agents handle library imports when they automatically generate pull requests (PRs). By analyzing 26,760 agent‑authored PRs from the AIDev dataset, the work uncovers surprising patterns that matter for anyone building or consuming AI‑assisted development tools.

Key Contributions

  • Empirical measurement of library import frequency in agent‑generated PRs (≈ 30 % of PRs).
  • Quantification of new dependency introductions (only 1.3 % of PRs) and the version‑pinning behavior of agents (75 % of added deps specify a version).
  • Comparison with raw LLM outputs, showing agents are far more disciplined about versioning than “bare” language‑model suggestions.
  • Catalog of library diversity, revealing that agents draw from a much broader set of external packages than previously reported for non‑agentic LLM code generation.

Methodology

  1. Dataset – The author leveraged the publicly available AIDev corpus, which contains PRs automatically created by a variety of coding agents (e.g., GitHub Copilot, ChatGPT‑based bots).
  2. PR filtering – Only PRs where the author field matched a known agent identifier were kept, resulting in 26,760 PRs across multiple languages and ecosystems (primarily JavaScript/Node, Python, and Java).
  3. Static analysis – For each PR, the changed files were parsed to detect:
    • import/require statements (library usage).
    • Additions to dependency manifests (package.json, requirements.txt, pom.xml, etc.).
  4. Version extraction – When a new dependency was added, the manifest entry was examined to see if an explicit version constraint was present.
  5. Baseline comparison – A parallel set of human‑written PRs and raw LLM‑generated snippets (without an agent wrapper) was analyzed to contextualize the findings.

The pipeline is fully reproducible and uses off‑the‑shelf parsers (AST‑based for JavaScript/Python, XML parsers for Maven) to keep the analysis approachable for developers.
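
The paper is summarized here only at this level of detail, so the following is a minimal sketch, in Python, of how AST‑based import detection and a version‑pinning check could work. The function names, the JavaScript regex fallback, and the "unpinned" heuristic are assumptions for illustration, not the author's actual tooling.

```python
import ast
import json
import re

def python_imports(source: str) -> set:
    """Collect top-level package names imported by a Python file."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 filters out relative imports like `from . import x`
            found.add(node.module.split(".")[0])
    return found

# Crude JavaScript fallback: match require('pkg') and `import ... from 'pkg'`,
# skipping relative paths. Scoped packages ('@scope/pkg') keep only the scope.
JS_IMPORT_RE = re.compile(r"""(?:require\(\s*|from\s+)['"]([^'"./][^'"]*)['"]""")

def js_imports(source: str) -> set:
    return {m.group(1).split("/")[0] for m in JS_IMPORT_RE.finditer(source)}

def unpinned_npm_deps(manifest_text: str) -> list:
    """List package.json dependencies with no explicit version constraint.

    Assumption: '*', 'latest', and an empty spec count as unpinned; anything
    naming a version (even a range such as ^1.2.0) counts as pinned. The
    paper's exact "explicit version" criterion may differ.
    """
    data = json.loads(manifest_text)
    deps = {**data.get("dependencies", {}), **data.get("devDependencies", {})}
    return [name for name, spec in deps.items() if spec in ("", "*", "latest")]
```

For example, python_imports("import numpy.linalg as la") returns {"numpy"}; applying detectors like these across a PR's changed files and manifests yields per‑PR import and pinning statistics of the kind reported below.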

Results & Findings

| Metric | Agent‑authored PRs | Raw LLM snippets (baseline) |
| --- | --- | --- |
| PRs that import at least one library | 29.5 % | 22 % |
| PRs that add a new dependency | 1.3 % | 0.4 % |
| New dependencies with an explicit version | 75 % | 12 % |
| Number of distinct libraries referenced | ≈ 1,200 | ≈ 350 |

  • Library imports are common but conservative – Agents tend to reuse already‑declared dependencies rather than pulling in fresh packages.
  • Versioning discipline – When agents do add a new library, they almost always pin a version, reducing the risk of downstream breakage.
  • Diverse ecosystem reach – The long tail of libraries (many used only once) suggests agents are not stuck on a narrow “favorite” set, unlike earlier studies of plain LLM code generation.

Practical Implications

  • Tool builders can trust that agent‑mediated PRs are less likely to introduce “dependency hell” compared with raw LLM suggestions, but they should still enforce review gates for any new package addition.
  • CI/CD pipelines may benefit from lightweight checks that flag un‑versioned dependency additions, a scenario that is now relatively rare but still possible (see the sketch after this list).
  • Package maintainers can anticipate that AI agents will gradually surface a broader range of libraries, potentially increasing traffic to niche packages.
  • Developer onboarding – Teams adopting AI coding assistants can focus their policy discussions on when to allow agents to add new deps rather than fearing a flood of uncontrolled imports.
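
As a concrete illustration of the CI check suggested above, here is a minimal sketch that scans a unified diff for dependency additions lacking a version. The script name, file patterns, and "unpinned" heuristics are assumptions; a production check would parse manifests properly and cover other ecosystems (pom.xml, lockfiles).

```python
import re
import sys

# Added manifest lines treated as "unpinned": a requirements.txt entry with no
# version comparator, or a package.json entry whose spec is empty, "*", or
# "latest". These patterns are illustrative assumptions, not the paper's tool.
REQUIREMENTS_UNPINNED = re.compile(r"^\+([A-Za-z0-9_.\-]+)\s*$")
PACKAGE_JSON_UNPINNED = re.compile(r'^\+\s*"([^"]+)"\s*:\s*"(?:\*|latest)?"')

def flag_unpinned(diff_text: str) -> list:
    """Scan a unified diff for dependency additions without a version."""
    flagged, current_file = [], ""
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            current_file = line[4:].strip()
        elif current_file.endswith("requirements.txt"):
            m = REQUIREMENTS_UNPINNED.match(line)
            if m:
                flagged.append(f"{current_file}: {m.group(1)} (no version)")
        elif current_file.endswith("package.json"):
            m = PACKAGE_JSON_UNPINNED.match(line)
            if m:
                flagged.append(f"{current_file}: {m.group(1)} (unpinned spec)")
    return flagged

if __name__ == "__main__":
    problems = flag_unpinned(sys.stdin.read())
    for p in problems:
        print(p)
    sys.exit(1 if problems else 0)  # fail the CI job when anything is flagged
```

Invoked as, say, `git diff origin/main | python check_pins.py` (a hypothetical script name), it exits non‑zero whenever an un‑versioned addition appears, a case the paper's numbers suggest should fire only rarely for agent‑authored PRs.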

Limitations & Future Work

  • Language scope – The analysis concentrates on the three most popular ecosystems; behavior could differ for Rust, Go, or .NET.
  • Agent heterogeneity – The dataset aggregates many agents with varying internal prompts and post‑processing; disentangling individual agent strategies was outside the paper’s scope.
  • Temporal dynamics – The study captures a snapshot; as agents evolve, their library‑selection heuristics may shift, calling for longitudinal monitoring.

Future research could explore how agents decide which version to pin (latest stable vs. exact) and whether they respect project‑specific dependency policies (e.g., internal mirrors, security scanners).

Authors

  • Lukas Twist

Paper Information

  • arXiv ID: 2512.11589v1
  • Categories: cs.SE
  • Published: December 12, 2025