Coding Agents for Software Engineers

Published: February 22, 2026 at 08:02 PM EST
4 min read
Source: Dev.to


1️⃣ What Is a Coding Agent?

A coding agent is not just an LLM. It is a system:

IDE / CLI → Agent Runtime → Context Builder → LLM Inference → Tool Execution (fs, git, tests, shell) → Loop

  • The model is only the reasoning engine.
  • The runtime handles orchestration.
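The pipeline above can be sketched as a minimal loop. All names here (`Tools`, `run_agent`, the fake LLM) are illustrative stand-ins, not any product's real API; the point is only that planning, context building, tool execution, and validation live in the runtime, while the model is a stateless callable.

```python
from dataclasses import dataclass

@dataclass
class Result:
    ok: bool
    error: str = ""

class Tools:
    """Stand-in tool layer: pretend the first attempt fails tests, the second passes."""
    def __init__(self):
        self.calls = 0
    def execute(self, action):
        self.calls += 1
        return Result(ok=self.calls > 1, error="2 tests failed")

def build_context(task, plan, repo):
    # The runtime, not the model, decides what the model sees.
    return f"TASK: {task}\nPLAN: {plan}\nFILES: {', '.join(repo)}"

def run_agent(task, repo, llm, tools, max_iters=5):
    """Plan -> build context -> infer -> execute tools -> validate -> iterate."""
    plan = llm(f"Write a short plan for: {task}")
    for _ in range(max_iters):
        context = build_context(task, plan, repo)
        action = llm(context)            # reasoning engine only
        result = tools.execute(action)   # fs/git/tests/shell run outside the model
        if result.ok:                    # validation gate
            return result
        plan = llm(f"Previous attempt failed ({result.error}); revise the plan.")
    return None
```

Note that `llm` is called fresh each iteration with a rebuilt context: the model keeps no state between calls.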

2️⃣ General Architecture of a Coding Agent

A production‑grade coding agent includes:

  1. Indexing Layer

    • Repo scanning
    • Symbol extraction
    • Dependency graph
    • Optional embeddings
  2. Context Builder

    • Select relevant files
    • Inject instructions
    • Add plan / scratchpad
    • Add recent edits
  3. LLM Inference Layer

    • Tokenized prompt
    • Context‑window constraints
    • Streaming output
  4. Tool Layer

    • File read/write
    • Test execution
    • Git diff/patch
    • Lint / build commands
  5. Loop Controller

    • Plan
    • Execute
    • Validate
    • Iterate

The model does not “see the repo.” The agent chooses what to send.
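The "agent chooses what to send" step can be sketched as a toy file selector. A real indexing layer would use symbol extraction, dependency graphs, or embeddings; this keyword-overlap ranking and the chars-per-4 token estimate are deliberately naive stand-ins to show the shape of the idea.

```python
def select_files(task, files, budget_tokens, est=lambda s: len(s) // 4):
    """Rank files by naive keyword overlap with the task, then pack
    greedily under a token budget."""
    words = set(task.lower().split())
    def score(item):
        path, text = item
        return sum(w in text.lower() or w in path.lower() for w in words)
    picked, used = [], 0
    for path, text in sorted(files.items(), key=score, reverse=True):
        cost = est(text)
        if used + cost <= budget_tokens:
            picked.append(path)
            used += cost
    return picked
```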


3️⃣ What Is the Context Window?

The context window is the maximum number of tokens the model can attend to in a single inference call. It includes:

System instructions
+ AGENTS.md / policies
+ Scratchpad / plan files
+ Relevant source files
+ Recent conversation
+ Tool outputs
+ Your current request
+ Model output

Everything must fit inside the window. A larger window does not mean you should send everything.
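A budget check over those components might look like the sketch below. The 4-characters-per-token ratio is a common rough heuristic (real runtimes use the model's actual tokenizer), and the window/reserve numbers are illustrative.

```python
def fits_window(parts, window=128_000, reserve_for_output=4_000):
    """Sum estimated tokens across all prompt components and check that
    they fit, leaving room for the model's output."""
    est = lambda s: len(s) // 4  # rough heuristic, not a real tokenizer
    used = sum(est(p) for p in parts)
    return used + reserve_for_output <= window, used
```

Here `parts` would be the list from above: system instructions, AGENTS.md, scratchpad, source files, conversation, tool outputs, and the request.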


4️⃣ Where Does Tokenization Happen?

Typically:

  • The agent runtime tokenizes locally (client‑side).
  • It estimates token usage before calling the model.
  • The server still processes tokens during inference.

Why client‑side tokenization matters

  • Avoid exceeding context limits
  • Control cost
  • Control chunking
  • Optimize file selection
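Client-side estimates also drive chunking: splitting a file so each piece stays under a per-call limit. The line-based splitter below uses the same chars-per-4 heuristic as a stand-in for a real tokenizer (e.g. tiktoken), purely to illustrate the mechanism.

```python
def chunk_text(text, max_tokens, est=lambda s: len(s) // 4):
    """Split text along line boundaries so that each chunk stays under
    a client-side token estimate."""
    chunks, cur, cur_tokens = [], [], 0
    for line in text.splitlines(keepends=True):
        t = est(line)
        if cur and cur_tokens + t > max_tokens:
            chunks.append("".join(cur))   # flush the current chunk
            cur, cur_tokens = [], 0
        cur.append(line)
        cur_tokens += t
    if cur:
        chunks.append("".join(cur))
    return chunks
```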

5️⃣ What Actually Consumes Tokens?

In coding workflows, token cost usually comes from:

  • Large source files
  • Test files
  • Logs
  • Replayed conversation history
  • Repeated system instructions
  • Scratchpad growth

Your instruction verbosity is rarely the main cost; file selection is.


6️⃣ What Makes “Good Quality” Context?

Good context is:

  • Relevant – only include files that matter.
  • Structured – clear task → constraints → deliverable.
  • Deterministic – explicit scope boundaries.
  • Minimal but sufficient – no narrative fluff, no repeated architecture explanation.

Bad context includes:

  • Entire repo dump
  • Long emotional explanations
  • Old irrelevant chat history
  • Ambiguous instructions

7️⃣ What Actually Improves Coding Responses?

1️⃣ Clear Scope

Bad: “Improve authentication system.”

Good:

Scope:
- src/auth/*
- src/middleware/auth.ts
Do not touch:
- public API
- schema definitions

2️⃣ Explicit Constraints

Examples:

  • Do not change public interfaces.
  • Preserve test behavior.
  • No new dependencies.
  • Keep diff minimal.

Constraints reduce hallucinated refactors.

3️⃣ Defined Output Format

Deliverable:
- Unified diff only
- Brief explanation

Note that the model does **not** remember these instructions between calls; the agent injects them into context each time.

9️⃣ Efficient Project Structure for Coding Agents

Recommended layout:

/AGENTS.md        # Global behavior rules (minimal)
/PLAN.md          # Task plan (editable)
/src/...
/tests/...

AGENTS.md should contain:

  • Coding standards
  • Test commands
  • “Plan first” rule
  • Guardrails

Keep it short; it is injected often.
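The reason brevity matters is simple arithmetic: a policy file re-injected every turn costs its token count multiplied by the turn count. A sketch, using the same rough chars-per-4 estimate as before (figures are illustrative):

```python
def injection_cost(policy_text, turns, est=lambda s: len(s) // 4):
    """Total token cost of a policy file that is re-injected on every turn."""
    return est(policy_text) * turns
```

A 10x longer AGENTS.md costs 10x more on every single turn of a session, so trimming it pays off repeatedly.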


🔟 Efficient Coding Agent Usage Patterns

Pattern A — Constrained Patch

Task:
Optimize middleware performance.

Scope:
src/auth/middleware.ts

Constraints:
- Preserve API
- No new deps

Output:
Unified diff only.

Pattern B — Incremental Execution

Implement only Step 1 from PLAN.md.
Run tests.
Update PLAN.md.
Stop.

Pattern C — Scope Locking

Explicitly limit directories:

Touch only:
src/auth/*
Do not modify:
src/db/*

This prevents token waste and unintended edits.
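A runtime can enforce a scope lock like this mechanically before applying a patch. The sketch below uses Python's `fnmatch` globs to mirror the "Touch only / Do not modify" lists; the patterns and paths are illustrative.

```python
from fnmatch import fnmatch

def check_scope(changed_paths, allow, deny):
    """Return the changed paths that violate the scope lock: anything
    matching a denied glob, or not matching any allowed glob."""
    violations = []
    for p in changed_paths:
        if any(fnmatch(p, d) for d in deny):
            violations.append(p)
        elif not any(fnmatch(p, a) for a in allow):
            violations.append(p)
    return violations
```

An empty result means the patch stayed in scope; anything else should abort the apply step.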


1️⃣1️⃣ What NOT to Do

  • ❌ Send the whole repo
  • ❌ Re‑explain system architecture every turn
  • ❌ Let scratchpads grow unbounded
  • ❌ Leave scope ambiguous
  • ❌ Ask for “improve everything”

1️⃣2️⃣ Big Context Myth

A 1M-token context window does not mean you should send 1M tokens, nor that doing so would be faster or more accurate.

Longer context:

  • Increases latency
  • Increases cost
  • Increases noise risk

Smart context selection beats raw size.


1️⃣3️⃣ Mental Model for Engineers

Treat coding agents like this:

LLM = Stateless reasoning engine
Context = Input data packet
Agent = Orchestrator
Scratchpad = External memory

Your job: optimize the data packet.


1️⃣4️⃣ Core Optimization Principles

  • Structure > verbosity
  • Relevance > completeness

  • Constraints > freedom
  • Iteration > giant prompts
  • Plan → execute → verify

Final Takeaway

Coding agents perform best when:

  • The task is clearly scoped
  • Constraints are explicit
  • Context is curated
  • Plans are externalized
  • History is pruned
  • Output format is constrained