🧠Maybe I Just Do Not Get It!

Published: (December 2, 2025 at 01:08 AM EST)
5 min read
Source: Dev.to

Source: Dev.to

The uncomfortable feeling of being the skeptic in an optimistic room

I have been working with AI for a while now—deep in it, shipping things, wiring models into products, watching logs at 3 am, apologizing to users when something weird happens. Yet every time I see a post that says:

“Autonomous agents will run your business for you.”

I hear a quiet, uncomfortable voice in my head:

“Maybe I am the one who does not get it.”

Everyone seems confident that we can simply:

  • Plug a model into a tool layer
  • Wrap it in a few prompts
  • Let it call APIs

and watch it “run operations” or “manage your company.”

Meanwhile I’m asking questions that sound almost boring:

  • Are we actually giving this thing real power over important decisions?
  • What exactly is our control surface?
  • What happens when it fails in an unanticipated way?
  • Do we really think prompt = governance?

Part of me worries I’m being too cautious, too old‑school, too stuck in an “engineering brain” while the world evolves into something more fluid and probabilistic. Another part thinks:

“Maybe someone has to say out loud that prompts are not control.”

So here is my attempt—half confession, half technical rant, fully self‑doubting.

Prompting feels like control, but it is not

When you write a prompt, it feels like you are writing a policy:

You are an operations assistant. You always follow company policy.
You never issue a refund above 200 EUR without human approval.
You always prioritize customer safety and data privacy.

It looks like a rule set, a mini spec. Underneath, however, you are merely feeding natural‑language text as conditioning into a statistical model that has:

  • No strict guarantee that those words will be followed
  • No built‑in concept of “policy” or “violation”
  • No deterministic execution path
  • No awareness of the “blast radius” of a single wrong action

You give it text; the model gives you back more text. The feeling of control comes from you, not from the system. The model does not “know” that those instructions are sacred; it only knows patterns in its weights that say: “When the input looks like this, texts like that often follow.” The gap between those two things is exactly where a lot of risk lives.

What real control looks like in software systems

If we set aside the AI hype and think like a backend engineer, “control” has always meant things like:

Permissions

  • Which identity can do what, where, and how often

Boundaries

  • Network segments, firewalls, read‑only vs. read‑write access, rate limits

Auditability

  • Who did what, when, and using which parameters

Reversibility

  • Can we undo this operation? Can we restore from backup?

Constraints and invariants

  • Account balance must never be negative
  • Orders must always have a valid user ID and product ID
  • This service cannot push more than X updates per second

On top of all that we layer:

  • Monitoring
  • Alerts
  • Fallback paths
  • Kill switches
  • Change management

It is tedious and unsexy, but it is what makes serious systems not collapse.

Now compare that with “control” for AI agents:

  • “We wrapped the model with a prompt that tells it to be safe.”
  • “We added a message saying: if unsure, ask a human.”
  • “We configured a few tools and let the agent decide which to call.”

There is a huge gap between these two worlds. I keep asking myself:

“Am I overreacting by demanding similar rigor for AI automation?
Or are we collectively underreacting because the interface is so friendly and the output is so fluent?”

Why prompt‑based control is fragile

Non‑determinism

Calling the same model with the same prompt and temperature 0.7 ten times yields:

  • Slightly different reasoning chains
  • Occasionally very different answers
  • Sometimes rare but catastrophic failure modes

This variability is acceptable in a chat setting, but far less so when the output:

  • Approves or denies a refund
  • Decides whether to escalate a compliance issue
  • Sends an email to an important customer

If your “policy” lives only in the prompt, the model can randomly deviate when a token path goes weird.

Context dilution and instruction conflicts

In complex agent setups, the model’s context looks something like:

System messages ("You are X")
Task instructions
Tool specs
History, previous steps, errors
User messages
Tool responses

Your carefully written safety instructions can:

  • Get buried deep in the context
  • Be overshadowed by later messages
  • Conflict with tool descriptions
  • Interact strangely with user input

You cannot be sure which instruction wins inside the model’s internal weighting; you are left hoping that the most important part is loud enough.

Distribution shift and weird edge cases

The model was trained on static data, then thrust into:

  • An evolving product
  • Changing user behavior
  • Novel business processes
  • Adversarial inputs from clever users

What you observed in internal tests is not a formal guarantee; it is merely evidence that under some sampled conditions the model behaved “well enough.” One weird edge case can cause a big problem.

Lack of grounded state and formal rules

Traditional systems have explicit state machines and rules that can be formalized, proved, or at least reasoned about. AI agents usually lack:

  • A formal internal model of the environment
  • A provable decision process
  • Compositional guarantees

If you want real control, you need to build it around the models, not inside the prompt. Which raises the question: Why are so many people comfortable skipping that part?

The three A’s: Automation, autonomy, authority

It helps to separate three concepts that marketing often blends together.

Automation

This is what we have done for decades:

  • Cron jobs
  • Scripts
  • Pipelines
  • Daemons

Automation means: “This specific routine step is handled by a machine.”

(Continuation of the discussion on autonomy and authority follows in the original talk.)

Back to Blog

Related posts

Read more »