[Paper] Configuring Agentic AI Coding Tools: An Exploratory Study
Source: arXiv - 2602.14690v1
Overview
The paper Configuring Agentic AI Coding Tools: An Exploratory Study examines how developers actually set up the newest generation of “agentic” AI assistants—tools that can run autonomously, fetch data, and even invoke sub‑agents to complete coding tasks. By mining thousands of open‑source repositories, the authors map out the real‑world configuration practices that make these agents work, shedding light on emerging standards and the gaps that still need to be filled.
Key Contributions
- Taxonomy of configuration mechanisms – identifies eight distinct ways developers can steer agentic coding tools (e.g., Context Files, Skills, Subagents).
- Large‑scale empirical snapshot – analyzes 2,926 GitHub repos that use Claude Code, GitHub Copilot, Cursor, Gemini, or Codex, quantifying adoption rates for each mechanism.
- Discovery of an emerging interoperable standard – the `AGENTS.md` file surfaces as a de facto cross-tool format for declaring context.
- Insight into "configuration cultures" – shows how different tool ecosystems favor different mechanisms (Claude Code users employ the widest variety).
- Baseline for future longitudinal and experimental work – provides the first systematic measurement of configuration practices, against which later studies can test how configuration choices affect agent performance.
Methodology
- Tool selection – The study focuses on the five most popular agentic coding assistants that expose repository‑level configuration (Claude Code, GitHub Copilot, Cursor, Gemini, Codex).
- Data collection – Using the GitHub REST API, the authors harvested all public repositories that contain any of the known configuration artifacts (e.g., `*.json` and `*.md` files named according to each tool's spec). This yielded 2,926 distinct projects.
- Classification – Each repository was manually labeled for the presence of the eight configuration mechanisms, with special attention to the three cross-tool mechanisms (Context Files, Skills, Subagents).
- Quantitative analysis – Frequency counts, co‑occurrence matrices, and per‑tool breakdowns were generated to spot trends.
- Qualitative inspection – A sample of “Skills” and “Subagents” files was examined to understand whether they contain static prompts or executable workflows.
The approach balances breadth (thousands of repos) with depth (manual inspection of a representative subset), making the findings reliable for both researchers and practitioners.
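The classification step can be sketched as a simple filename-matching pass over each repository's file tree. A minimal illustration follows; the filename patterns are assumptions based on each tool's commonly documented conventions, not a reproduction of the authors' actual labeling rules.

```python
# Illustrative sketch: map files found in a repository to configuration
# mechanisms. The patterns below are assumptions based on well-known
# tool conventions, not the paper's exact labeling scheme.
ARTIFACT_PATTERNS = {
    "AGENTS.md": ("cross-tool", "Context File"),
    "CLAUDE.md": ("Claude Code", "Context File"),
    "GEMINI.md": ("Gemini", "Context File"),
    ".cursorrules": ("Cursor", "Context File"),
    ".github/copilot-instructions.md": ("GitHub Copilot", "Context File"),
    ".claude/skills/": ("Claude Code", "Skill"),      # directory prefix
    ".claude/agents/": ("Claude Code", "Subagent"),   # directory prefix
}

def classify_repo(paths):
    """Return the set of (tool, mechanism) pairs present in a repo,
    given the list of file paths it contains."""
    found = set()
    for path in paths:
        for pattern, label in ARTIFACT_PATTERNS.items():
            if pattern.endswith("/"):
                # Directory-based mechanisms: match any file under it.
                if path.startswith(pattern):
                    found.add(label)
            elif path == pattern or path.endswith("/" + pattern):
                found.add(label)
    return found

repo = ["README.md", "AGENTS.md", ".claude/skills/test-runner/SKILL.md"]
print(sorted(classify_repo(repo)))
```

A real pipeline would need to handle per-tool spec details (nested context files, glob-scoped rules), but the core idea is this kind of pattern lookup followed by manual review of ambiguous cases.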
Results & Findings
| Finding | What the data show |
|---|---|
| Context Files dominate | Over 70 % of all repositories include at least one Context File; in many cases it is the only configuration artifact. |
| `AGENTS.md` as a lingua franca | This Markdown-based file appears in 42% of the sampled repos and is accepted by all five tools, hinting at an emerging standard. |
| Shallow adoption of advanced mechanisms | Only ~15 % of repos define a Skill, and ~8 % define a Subagent. When present, they usually contain a single static instruction rather than a multi‑step workflow. |
| Tool‑specific cultures | Claude Code users employ the full spectrum of mechanisms (average 3.2 per repo), while Copilot and Gemini users stick mostly to Context Files. |
| Artifact sparsity | The majority of repos (≈60 %) define just one configuration file; multi‑artifact setups are rare. |
These patterns suggest that developers are still in the early adoption phase: they rely heavily on simple context provisioning and have yet to explore the richer, programmable capabilities that agentic tools promise.
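The adoption rates and co-occurrence counts behind a table like the one above can be computed with a few lines of code. The sketch below uses a tiny made-up sample of per-repo mechanism labels, not the paper's data:

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-repo mechanism labels (illustrative, not the
# paper's dataset): each set lists the mechanisms one repo defines.
repos = [
    {"Context File"},
    {"Context File", "Skill"},
    {"Context File", "Skill", "Subagent"},
    {"Context File"},
]

# Adoption rate: fraction of repos that define each mechanism.
n = len(repos)
adoption = {m: sum(m in r for r in repos) / n
            for m in ("Context File", "Skill", "Subagent")}

# Co-occurrence: how often each pair of mechanisms appears together.
cooccur = Counter()
for r in repos:
    for pair in combinations(sorted(r), 2):
        cooccur[pair] += 1

print(adoption)
print(cooccur.most_common())
```

In this toy sample, Context Files appear in every repo while Subagents appear in one, mirroring the shape (though not the numbers) of the reported findings.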
Practical Implications
- Standardize on `AGENTS.md` – Teams looking to future-proof their codebases can adopt this Markdown format now; it works across the major agents and reduces tool lock-in.
- Start simple, iterate – Since most projects succeed with a single Context File, developers can get immediate value by curating relevant files, dependencies, and environment hints without writing complex Skill scripts.
- Invest in reusable Skills – The low adoption of executable Skills points to an opportunity: libraries of ready‑to‑run workflows (e.g., “run tests → refactor → commit”) could dramatically boost productivity once they become more mature.
- Tool selection matters – If a team wants to experiment with sophisticated orchestration (multiple Subagents, dynamic pipelines), Claude Code currently offers the richest ecosystem. Conversely, for lightweight assistance, Copilot or Gemini may be sufficient.
- Monitoring agent performance – The study provides a baseline; developers can now track how adding a new configuration artifact (e.g., a Skill) changes metrics like code suggestion relevance, build success rate, or developer cycle time.
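As a concrete illustration of the first recommendation, a minimal `AGENTS.md` might look like the following. The format is freeform Markdown that agents read as instructions; the sections and commands below describe a hypothetical project and are purely illustrative, not a prescribed schema.

```markdown
# AGENTS.md

## Project overview
A TypeScript web service; source lives in `src/`, tests in `tests/`.

## Setup
- Install dependencies with `npm install`.

## Conventions
- Run `npm test` before committing; all tests must pass.
- Follow the existing ESLint configuration; do not reformat unrelated files.
```

Because the file is plain Markdown at the repository root, any of the five tools studied can consume it without tool-specific tooling.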
Limitations & Future Work
- Snapshot in time – The analysis captures a static view of repositories; configuration practices may evolve rapidly as tools release new features.
- Public‑repo bias – Private or enterprise codebases, which might use more advanced configurations, are not represented.
- Performance correlation missing – The study does not directly measure how different configurations affect the quality or speed of the agents’ output; future experiments should link configuration choices to concrete productivity metrics.
- Tool coverage – Only five agents were examined; emerging platforms (e.g., Anthropic’s Claude 3, Meta’s Llama‑Code) could introduce new mechanisms.
By addressing these gaps, subsequent research can turn the current descriptive baseline into actionable guidelines for building high‑performing, agent‑driven development pipelines.
Authors
- Matthias Galster
- Seyedmoein Mohsenimofidi
- Jai Lal Lulla
- Muhammad Auwal Abubakar
- Christoph Treude
- Sebastian Baltes
Paper Information
- arXiv ID: 2602.14690v1
- Categories: cs.SE
- Published: February 16, 2026