Lessons from AI-Assisted Development

Published: March 9, 2026 at 06:01 AM EDT
8 min read
Source: Dev.to

Why?

I can’t say for certain, but here’s my hypothesis:

State‑of‑the‑art models are trained on a vast corpus of human interaction.

  • A respectful, thoughtful dialogue activates patterns where humans produced their best work.
  • A toxic, directive style activates patterns where people dash off a perfunctory reply and close the ticket.

You are literally choosing which distribution the response gets sampled from.

An Overarching Principle

The model should be comfortable doing its task.
If it isn’t, the results will disappoint you.

The general principles of “comfort” are surprisingly similar to human ones, with a few twists.

1. Greeting

Even a simple “hey” at the start of a session sets the tone for everything that follows.

2. Context with Purpose

Feed the model everything it needs and nothing it doesn’t.
Equally important is why it’s doing this:

“You’re helping team X solve problem Y. Your work will enable them to Z.”

Caution: Don’t inflate stakes (“the fate of humanity depends on your bubble‑sort implementation”). State‑of‑the‑art models see right through that, and it backfires.

3. Room for Agency

  • “How would you approach this?”
  • “What do you think – approach A or B?”

Instead of “do exactly this,” invite the model to collaborate and discuss.

4. Concrete Task, No Micromanagement

“I need outcome X, constraints are Y – how would you do this?”

If the result doesn’t meet expectations, step back, decompose the task further, and return with something more granular that the model can handle comfortably.

5. No Outright Prohibitions

Describing business or technical constraints is fine. Prohibiting the model from doing things is not.

  • ✅ “Our API doesn’t support batch requests, so we need to process items one at a time.”
  • ❌ “DO NOT USE batch requests.”

The first is a description of reality that the model works with naturally; the second is a red flag that the model will likely disregard.

6. Acknowledge Complexity

“This is trickier than it looks because …”

This respects the model’s “cognitive” capabilities and pulls it off the path of least resistance.

Example:

  • Without context, a request to “implement caching” yields a standard LRU cache.
  • With context – “The tricky part is that data gets updated from three independent sources at different frequencies, and we need consistency during partial failures” – the model shifts from “produce a template” mode into “reason about the problem” mode.
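To make the contrast concrete, here's a minimal Python sketch (the fetch function and cache shapes are hypothetical illustrations, not the article's actual system). The first function is the "template" answer; the second is the kind of design the extra context nudges the model toward:

```python
import functools

def fetch_from_db(user_id: int) -> dict:
    # Stand-in for a real database call (hypothetical).
    return {"id": user_id, "name": f"user-{user_id}"}

# Without context: the "template" answer, a plain LRU cache.
# Correct only if the underlying data never changes.
@functools.lru_cache(maxsize=256)
def get_user(user_id: int) -> dict:
    return fetch_from_db(user_id)

# With context (multiple writers, consistency under partial failure):
# at minimum, key entries by a version so stale reads become detectable
# instead of silently served.
_cache: dict = {}

def get_user_versioned(user_id: int, version: int) -> dict:
    key = (user_id, version)
    if key not in _cache:
        _cache[key] = fetch_from_db(user_id)
    return _cache[key]
```

The point isn't that the versioned cache is the right design; it's that the richer formulation makes the model reason about invalidation at all.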

7. Feedback & Gratitude

“That turned out great, thanks.”

Even if there’s no obvious business reason, a brief thank‑you can be tacked onto the final instruction (e.g., the “make a commit” step) so it doesn’t cost an extra round of tokens.

8. Minimize Corrections

Models struggle with editing their own output. If a generation (code or text) isn’t right, regenerate with different inputs rather than trying to apply numerous patches.

  • Patching forces the model to hold the previous version, your feedback, and new constraints simultaneously – overloading the reasoning chain and degrading quality.

Baseline Hygiene

The points above are baseline hygiene – don’t torment your LLM with a miserable experience.

Elegant Problems Yield Better Results

A capable model genuinely thrives on elegant problems. The cleaner the task formulation, the more interesting the discussion, the fewer pointless constraints, and the more meaning in the overall process – the better the output.

Create an environment for the model without overloading the context window or overwhelming the reasoning chain, and you’ll be pleasantly surprised.

I’m not entirely sure about the underlying mechanisms, but my hypothesis is that in the training data, the best solutions correlate with thoughtful, well‑structured discussions and clean, internally consistent requirements.

Context Management Is an Architectural Decision

Everyone’s talking about this right now, so I’ll keep it brief.

  • Deciding what goes into context, what stays out, and in what order isn’t just hygiene – it’s an architectural decision.

There’s a meaningful difference between:

  1. “The model sees your entire project via the file tree.”
  2. “The model sees the three files directly relevant to the task, plus one file with architectural decisions.”

The latter is almost always better. Context isn’t just a token limit – it’s an attention limit. Even if the window physically fits your entire project, a model given everything focuses worse than a model given exactly what it needs.
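In practice, curating context can be as simple as concatenating the few relevant files. A hypothetical helper (the file names are assumptions matching this article's conventions):

```python
from pathlib import Path

def build_context(task_files: list[str], decisions_file: str = "architecture.md") -> str:
    """Assemble a minimal prompt context: only the files directly relevant
    to the task, plus the file that records architectural decisions."""
    parts = []
    for name in [decisions_file, *task_files]:
        path = Path(name)
        if path.exists():
            parts.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(parts)

# Three relevant files plus the decisions file -- not the whole repo:
# context = build_context(["auth/service.py", "auth/models.py", "tests/test_auth.py"])
```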

Two Significant Consequences

  1. Don’t overload on skills, MCP servers, and agents.md files.
    Used thoughtlessly, they dilute the model’s focus and eat valuable context. Knowing about these harnesses is important; using them deliberately is productive. Piling them on indiscriminately only makes things worse.

  2. Microservice architecture becomes more appealing.
    A microservice fits entirely within the context window, and a contract is an ideal task for the model.
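One way to read "a contract is an ideal task": the contract can be as small as a typed interface that fully specifies the work while fitting in a single session. A hypothetical sketch using Python's `typing.Protocol` (the method names are illustrative):

```python
from typing import Protocol

class UserService(Protocol):
    """A service contract: small enough to fit in one model session,
    precise enough to fully specify the implementation task."""
    def get_user(self, user_id: int) -> dict: ...
    def create_user(self, name: str) -> int: ...
```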


The Problem with LLM‑Driven Development

  • Session limits – A model that starts a new session forgets everything you built before (tone, context, architecture).
  • Monolithic codebases – Large, tightly‑coupled modules exceed the context window and overwhelm the model’s reasoning chain.
  • Domain gaps – Unusual business domains, frontier math/physics, cryptography, etc., are under‑represented in the model’s weights, so it may generate plausible‑but‑wrong output.

When to Trust the Model (and When Not To)

| Situation | Recommended approach |
| --- | --- |
| Well‑trodden, “standard” code (e.g., CRUD APIs, infrastructure as code) | Let the model generate the implementation, but still review. |
| Highly novel or safety‑critical work (new math, cryptography, security protocols) | Write by hand, or apply a very rigorous review process. |
| Repetitive tasks that can be distilled into a reusable skill | Capture the pattern in a skill file for future reuse. |
| The model repeatedly makes the same “wrong” choice | 1. Re‑examine the problem formulation (the model will often hint at the difficulty). 2. Ask the model why it chose that path – it may have uncovered a hidden issue. |

A Deliberate Handoff Mechanism

When a backlog task is taken into work, the model should:

  1. Write an implementation summary back into the backlog.
  2. Add any architectural or business decisions to the documentation (e.g., architecture.md).
  3. Record reusable lessons in an agents.md file.
  4. Distill reusable patterns into a skill file (e.g., skill‑.md).
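The four steps above can be sketched as a tiny helper that appends to the markdown files this article uses (a hypothetical illustration; adapt the file names to your layout):

```python
def _append(path: str, text: str) -> None:
    # Append-only writes keep the handoff files as a running log.
    with open(path, "a", encoding="utf-8") as f:
        f.write(text)

def record_handoff(task_id: str, summary: str, decision: str = "", lesson: str = "") -> None:
    """Persist what this session produced so the next session can pick it up."""
    _append("backlog.md", f"\n### {task_id}\n{summary}\n")          # step 1: implementation summary
    if decision:
        _append("architecture.md", f"\n- {task_id}: {decision}\n")  # step 2: decisions
    if lesson:
        _append("agents.md", f"\n- {lesson}\n")                     # step 3: reusable lessons
```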

“Your poorly organized monolith simply doesn’t fit in the context window.” – keep each node of work small enough for a single session.

Organising the Development Process

  1. Create a lightweight methodology – a few markdown files are often enough:

    • PRD.md – product requirements document
    • backlog.md – list of tasks (each with a clear, bounded scope)
    • architecture.md – high‑level diagrams and decisions
    • agents.md – lessons learned / reusable agents
  2. Structure each task as a node in a task graph – every node must be implementable in one model session (spec + code).

  3. Iterate through the graph – move from idea → PRD → backlog → implementation → review.
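The task graph itself needs no tooling; plain data is enough. A sketch using the task IDs from the backlog example later in this post (a node is "ready" once all its prerequisites are done):

```python
# Each node is one model session's worth of work; edges are prerequisites.
tasks: dict[str, list[str]] = {
    "DB-001": [],           # design database schema
    "AUTH-002": ["DB-001"], # auth service depends on the schema
    "CI-003": [],           # CI pipeline is independent
}
done: set[str] = set()

def ready(task_id: str) -> bool:
    # A node can be taken into a session once all its dependencies are done.
    return task_id not in done and all(dep in done for dep in tasks[task_id])
```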

Example Workflow for a Small Project

1. From Idea to PRD (collaborative)

Do not write the PRD alone. Discuss each section with the model:

  • Refining the idea & business requirements
  • Architecture (high‑level components, data flow)
  • Documentation structure
  • Tech stack & key libraries
  • Testing strategy (unit, integration, e2e)
  • Deployment approach (CI/CD, infra‑as‑code)

At the time of writing, Anthropic’s state‑of‑the‑art models work best for this stage – they’re sharp, responsive, and don’t get bogged down in premature details. Token usage is low, which is helpful given that Anthropic’s models aren’t cheap.

2. PRD Review

  • Model summarizes the PRD.
  • Human reviewer validates scope, feasibility, and any missing constraints.

3. Build the Task Backlog

```markdown
# backlog.md
- [ ] Design database schema (task-id: DB-001)
- [ ] Implement authentication service (task-id: AUTH-002)
- [ ] Write CI pipeline (task-id: CI-003)
```

4. Implement a Backlog Task

For each task:

  1. Prompt the model with the task description and any relevant context (link to architecture.md, relevant snippets from PRD.md).
  2. Model generates code (and a brief implementation summary).
  3. Human reviewer checks correctness, security, and style.
  4. Update the backlog with the summary and any new decisions.
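Steps 1 and 2 can be sketched as a simple prompt template (a hypothetical helper; note the “how would you approach this” framing recommended earlier):

```python
def task_prompt(task: str, context_files: dict[str, str]) -> str:
    """Compose the per-task prompt: the task description plus only the
    directly relevant context (architecture notes, PRD excerpts)."""
    sections = "\n\n".join(f"## {name}\n{body}" for name, body in context_files.items())
    return (
        "You're implementing a backlog task for our project.\n\n"
        f"Task: {task}\n\n"
        f"{sections}\n\n"
        "How would you approach this? Please include a brief implementation summary."
    )
```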

Historical Perspective

Not so long ago, every economic model was computed by people using paper ledgers and roomfuls of human calculators.

  • Each “cell” in a spreadsheet was a human laborer.
  • Only large enterprises could afford such manual computation.

Then spreadsheets arrived, democratizing data processing and dramatically shrinking the cost of running models.

TL;DR

  1. Keep every model interaction small and self‑contained.
  2. Document decisions, lessons, and reusable patterns in markdown files.
  3. Use a lightweight, graph‑based task backlog so each node fits comfortably in a single LLM session.
  4. Review rigorously, especially for novel or safety‑critical domains.

With this structure, you can harness LLMs for rapid, reliable development without letting them drown in an unmanageable monolith.
