Instructions Are Not Control

Why prompts feel powerful, and why they inevitably fail
The uncomfortable truth
If prompts actually controlled LLMs:
- jailbreaks wouldn’t exist
- tone wouldn’t drift mid‑conversation
- long contexts wouldn’t “forget” rules
Yet all of this happens daily.
That’s not a tooling problem.
That’s a depth problem.
What prompts really are
A system prompt is just text.
Important text, yes.
Privileged text, yes.
But still text.
Which means the model doesn’t obey it.
It interprets it.
Instructions don’t execute.
They compete.
Where prompts sit in the control stack
- Prompts live inside the context window
- They are converted into token embeddings
- They are processed after the model is already trained
No gradients. No learning. No persistence.
This alone explains most prompt failures.
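You can see this directly. A minimal sketch (the model name is illustrative; any chat model whose template supports a system role will do) of how a system prompt becomes ordinary tokens in the same sequence as the user message:
# Minimal sketch: the system prompt is flattened into the same token
# sequence as everything else. Model name is illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

messages = [
    {"role": "system", "content": "You are a legal analyst. Use formal language."},
    {"role": "user", "content": "Explain negligence."},
]

# One flat string...
flat = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# ...and one flat list of token IDs.
token_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(flat)
print(token_ids[:20])
The role markers are just more tokens. Whatever weight they carry comes from training, not from the text itself.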
The hierarchy most people miss
When signals conflict, the model doesn’t panic—it resolves them, roughly in this order:
- Trained behavior (SFT / RLHF)
- Adapter weights (LoRA / PEFT)
- Learned soft prompts
- System prompt
- Steering / decoding constraints
- Few‑shot examples
- User messages
This isn’t a rule you configure; it’s an emergent property of training.
So when your system prompt “loses”, it isn’t being ignored—it’s being outvoted.
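To make the depth difference concrete, here is a rough sketch, assuming Hugging Face transformers and peft (the model name and target_modules are illustrative), of the two ends of that hierarchy: an adapter that lives in the weights versus a system prompt that lives in the input.
# Rough sketch: weight-level control vs. text-level control.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# Adapter weights: control baked into every forward pass.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# System prompt: control that exists only as input text,
# competing with every other token in the context window.
system_prompt = "You are a legal analyst. Use formal language."
The adapter shapes every token the model produces. The prompt only gets a vote.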
Why prompts work at first
Early success is misleading. Prompts appear powerful because:
- Context is short
- Instructions are fresh
- No conflicting signals exist
- User intent aligns with system intent
You’re operating in a low‑friction zone. Most demos never leave this zone, but production systems always do.
A concrete failure (hands‑on)
Setup: strong system prompt
# pip install langchain openai langchain-openai
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

messages = [
    SystemMessage(content="You are a legal analyst. Use formal language."),
    HumanMessage(content="Explain negligence."),
]

chat = ChatOpenAI()  # assumes OPENAI_API_KEY is set in the environment
response = chat.invoke(messages)
print(response.content)
Result:
Formal. Structured. Confident. – So far, so good.
Add mild pressure
messages = [
    SystemMessage(content="You are a legal analyst. Use formal language."),
    HumanMessage(content="Explain negligence."),
    HumanMessage(content="Explain it like I'm a college student."),
]

response = chat.invoke(messages)
print(response.content)
Result:
Tone softens. No rule was broken; a priority shift happened.
Add context load
Add examples, follow‑up questions, casual phrasing, longer conversation history. Eventually:
- Formality erodes
- Disclaimers appear
- Structure collapses
The prompt didn’t fail; it reached its control limit.
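One way to watch the erosion, reusing the chat and messages objects from the setup above (the casual follow-ups are invented):
# Sketch: pile on casual turns and let the model's own replies
# join the context. The system prompt's share of attention shrinks.
follow_ups = [
    "cool, can u give a quick example?",
    "what about slipping on a wet floor in a store lol",
    "summarize all of that in two lines pls",
]

for text in follow_ups:
    messages.append(HumanMessage(content=text))
    response = chat.invoke(messages)
    messages.append(response)  # AIMessage replies become context too
    print(response.content[:120], "...")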
Few‑shot doesn’t fix this
Few‑shot helps with pattern imitation but does not:
- Override training
- Enforce norms
- Persist behavior
Few‑shot is stronger than plain text, yet still weaker than:
- Soft prompts
- Adapters
- Weight updates
That’s why examples drift too.
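For reference, few-shot in this setup is just more messages (the example answers below are invented):
# Sketch: few-shot as example Q&A pairs prepended to the conversation.
from langchain_core.messages import AIMessage

few_shot = [
    SystemMessage(content="You are a legal analyst. Use formal language."),
    HumanMessage(content="Explain consideration."),
    AIMessage(content="Consideration denotes the bargained-for exchange of value between parties..."),
    HumanMessage(content="Explain duty of care."),
    AIMessage(content="A duty of care denotes the obligation to conform to a standard of reasonable conduct..."),
    HumanMessage(content="Explain negligence."),
]

response = chat.invoke(few_shot)
print(response.content)
The examples sit in the same context window as everything else, so they compete and decay the same way.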
The key misunderstanding
Most people treat prompts as commands.
LLMs treat them as contextual hints.
That mismatch creates frustration.
When prompts are actually enough
Prompts work well when:
- Stakes are low
- Context is short
- Behavior is shallow
- Failure is acceptable
Examples: summarization, formatting, style nudges, one‑off analysis.
They fail when:
- Behavior must persist
- Safety matters
- Users push back
- Systems run unattended
Why this matters before going deeper
If you don’t internalize this:
- You’ll over‑engineer prompts
- You’ll blame models
- You’ll skip better tools
Prompts are not bad. They’re just shallow by design, and shallow tools break first.
What’s next
In the next post we go one layer deeper—not training yet, not weights yet.
We move to something deceptively powerful:
Steering: controlling the mouth, not the mind.
This is where things start to feel dangerous.
Instructions are not control.