[Paper] AgenticAKM: Enroute to Agentic Architecture Knowledge Management
Source: arXiv:2602.04445v1
Overview
The paper AgenticAKM: Enroute to Agentic Architecture Knowledge Management tackles a long‑standing pain point for developers and architects: keeping software architecture documentation up‑to‑date. By orchestrating multiple specialized LLM‑driven “agents” that work together to extract, retrieve, generate, and validate architectural knowledge, the authors demonstrate a practical way to automate the creation of Architecture Decision Records (ADRs) directly from code repositories.
Key Contributions
- Agentic workflow for AKM – Introduces a multi‑agent pipeline (Extraction, Retrieval, Generation, Validation) that decomposes the complex task of architecture recovery into tractable sub‑tasks.
- Prototype for ADR generation – Implements the workflow on real‑world GitHub repositories, automatically producing ADRs that capture design decisions.
- Empirical user study – Evaluates the approach on 29 open‑source projects, showing higher‑quality ADRs than a single‑prompt baseline.
- Open discussion of prompt engineering limits – Highlights why a naïve “one‑prompt‑fits‑all” strategy fails for distributed architectural knowledge.
Methodology
- Problem decomposition – The authors view architecture knowledge management as a series of steps rather than a monolithic query.
- Specialized agents
- Extraction Agent scans the codebase (e.g., build files, configuration, source code) and pulls out low‑level artefacts (components, dependencies, patterns).
- Retrieval Agent searches existing documentation, issue trackers, and commit messages to locate any prior architectural rationale.
- Generation Agent feeds the collected artefacts into an LLM prompt that drafts an ADR, following a standard template (Context, Decision, Status, Consequences).
- Validation Agent runs consistency checks (e.g., does the ADR reference existing code? Are required fields filled?) and asks the LLM to refine the draft if needed.
- Iterative loop – If validation fails, the Generation Agent is invoked again with additional context, mimicking a human reviewer’s back‑and‑forth.
- Implementation – The prototype uses OpenAI’s GPT‑4 API, a simple file‑system crawler, and a vector store for retrieval. The whole pipeline is orchestrated with a lightweight task‑queue.
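The four‑step pipeline and its validation loop can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the agent functions are stand‑ins (the real prototype calls the GPT‑4 API where `generation_agent` appears here), and all names and heuristics are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ADR:
    """Draft following the standard template: Context, Decision, Status, Consequences."""
    context: str
    decision: str
    status: str
    consequences: str

def extraction_agent(repo_files):
    """Scan the codebase and pull out low-level artefacts.
    Stand-in: flag dependency-like lines; the prototype inspects build files too."""
    return [line for body in repo_files.values() for line in body.splitlines()
            if "import" in line or "depends" in line]

def retrieval_agent(docs, query_terms):
    """Locate prior rationale in docs/issues/commits.
    Stand-in: keyword match; the prototype uses a vector store."""
    return [d for d in docs if any(t in d for t in query_terms)]

def generation_agent(artefacts, rationale):
    """Stand-in for the LLM call that drafts an ADR from the collected context."""
    return ADR(
        context="; ".join(rationale) or "No prior rationale found",
        decision=f"Adopt the design implied by {len(artefacts)} extracted artefacts",
        status="proposed",
        consequences="Documentation now records the recovered decision",
    )

def validation_agent(adr):
    """Consistency check: every required template field must be filled."""
    return all([adr.context, adr.decision, adr.status, adr.consequences])

def run_pipeline(repo_files, docs, max_rounds=3):
    """Extraction -> Retrieval -> Generation -> Validation, retrying with
    extra context on failure, mimicking a reviewer's back-and-forth."""
    artefacts = extraction_agent(repo_files)
    rationale = retrieval_agent(docs, ["architecture", "decision"])
    for _ in range(max_rounds):
        draft = generation_agent(artefacts, rationale)
        if validation_agent(draft):
            return draft
        rationale.append("Reviewer feedback: fill the missing template fields")
    raise RuntimeError("validation failed after retries")
```

The decomposition is what keeps each LLM call small: each agent sees only the context it needs, which is how the paper sidesteps the token‑limit issues of a single giant prompt.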
Results & Findings
- Quality boost – In the user study, 78% of the ADRs produced by AgenticAKM were rated “useful” or “very useful” by participants, versus 52% for the single‑prompt baseline.
- Reduced manual effort – Participants reported a 40% drop in time spent writing ADRs when they could start from the agent‑generated drafts.
- Higher coverage – The multi‑agent system uncovered architectural decisions that were completely missing from the original documentation in 6 of the 29 repositories.
- Prompt length management – By splitting the problem, each LLM call stayed well within token limits, avoiding the truncation issues that plagued the naïve approach.
Practical Implications
- Automated ADR pipelines – Teams can plug AgenticAKM into CI/CD to continuously generate or update ADRs as code evolves, keeping documentation in sync without extra overhead.
- Onboarding acceleration – New hires get instant, LLM‑generated summaries of key design choices, shortening the learning curve.
- Compliance & audit readiness – Regular, machine‑produced architecture records help satisfy regulatory or internal governance requirements.
- Extensible to other artefacts – The same agentic pattern could be repurposed for generating design docs, API contracts, or migration guides, making it a reusable building block for knowledge automation.
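A minimal sketch of what the CI/CD hook from the first point could look like: hash the tracked source files on each run and only regenerate ADR drafts when the fingerprint changes. `source_fingerprint` and `adr_is_stale` are hypothetical helpers, not part of the published prototype.

```python
import hashlib
import pathlib

def source_fingerprint(paths):
    """Hash the tracked files so the hook can detect architectural drift."""
    h = hashlib.sha256()
    for p in sorted(paths):
        h.update(pathlib.Path(p).read_bytes())
    return h.hexdigest()

def adr_is_stale(adr_dir, fingerprint):
    """Compare against the fingerprint stored at the last ADR generation.
    The marker-file convention here is an assumption for illustration."""
    marker = pathlib.Path(adr_dir) / ".last_fingerprint"
    return not marker.exists() or marker.read_text() != fingerprint

def ci_step(source_paths, adr_dir, regenerate):
    """Run inside CI: regenerate ADR drafts only when sources changed."""
    fp = source_fingerprint(source_paths)
    if adr_is_stale(adr_dir, fp):
        regenerate()  # e.g. invoke the agent pipeline on the changed repo
        (pathlib.Path(adr_dir) / ".last_fingerprint").write_text(fp)
        return True
    return False
```

Gating on a fingerprint keeps LLM costs proportional to actual change, which matters if the pipeline runs on every commit.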
Limitations & Future Work
- LLM hallucination risk – Although the Validation Agent mitigates obvious errors, the system can still produce plausible‑but‑incorrect rationales, especially when source code lacks clear patterns.
- Domain specificity – The prototype was evaluated on open‑source Java/JavaScript projects; performance on legacy codebases, micro‑service ecosystems, or low‑level systems remains untested.
- Scalability of retrieval – The current vector store works for modest repositories; larger monorepos may need more sophisticated indexing and chunking strategies.
- Future directions – The authors plan to (1) integrate static analysis tools for richer artefact extraction, (2) experiment with fine‑tuned LLMs to reduce hallucinations, and (3) broaden the evaluation to industrial settings with stricter security constraints.
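To make the retrieval‑scalability point concrete, here is the simplest chunking strategy a vector store might use, a fixed‑size sliding window. The paper does not specify the prototype's chunking, so this is illustrative; a large monorepo would likely need structure‑aware splitting (per file, per class) instead.

```python
def chunk_text(text, size=800, overlap=100):
    """Split text into fixed-size chunks with overlapping edges so that
    a decision straddling a boundary is still retrievable from one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

The overlap parameter trades index size for recall: larger overlap duplicates more text in the store but reduces the chance that a rationale is cut in half at a chunk boundary.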
Authors
- Rudra Dhar
- Karthik Vaidhyanathan
- Vasudeva Varma
Paper Information
- arXiv ID: 2602.04445v1
- Categories: cs.SE
- Published: February 4, 2026