Using a Single GPT Client as a Language Runtime (No API, No Agents)

Published: December 20, 2025 at 09:14 PM EST
2 min read
Source: Dev.to

Overview

Most examples of LLM usage on DEV focus on one of three things:

  • Prompt tricks
  • Tool calling / agents
  • Backend‑heavy workflows

This post explores a different idea: what if a single GPT client could behave like a lightweight, auditable runtime, purely through interaction design? No APIs, no agents, no backend.

The problem: chat is a poor execution model

Chat‑style prompting is flexible, but it has structural weaknesses:

  • Outputs vary across runs
  • Decisions are hard to audit
  • Missing inputs don’t block execution

For decision‑oriented tasks (investment checks, risk screening, stop‑loss decisions), this is a serious problem. The issue is not intelligence; it is the lack of structure around execution.

The core idea: language‑level runtime

In software, a runtime enforces three things:

  1. Input contracts
  2. Execution order
  3. Output structure

Instead of building a new framework, I tried enforcing these constraints directly in natural language, inside a GPT client. The result behaves surprisingly like a runtime.
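
For contrast, here is what those three guarantees look like when a conventional runtime (plain Python, in this sketch) enforces them. The field names are made up for illustration and this is not code from the experiment; the experiment reproduces the same guarantees inside the prompt instead.

```python
from dataclasses import dataclass

@dataclass
class TradeInput:          # 1. input contract: typed, required fields
    ticker: str
    entry_price: float
    stop_loss: float

@dataclass
class Decision:            # 3. output structure: always the same shape
    grade: str             # "PASS" / "WATCH" / "STOP"
    reasons: list[str]

def evaluate(inp: TradeInput) -> Decision:
    # 2. execution order: validation always runs before grading
    reasons = []
    if inp.stop_loss >= inp.entry_price:
        reasons.append("stop-loss is not below the entry price")
    return Decision(grade="PASS" if not reasons else "WATCH", reasons=reasons)

print(evaluate(TradeInput(ticker="ACME", entry_price=42.0, stop_loss=38.5)))
```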

Step 1: protocol binding (runtime header)

Every session begins with a minimal header:

protocol: yuerdsl

Think of it as a language‑level execution gate.
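
In code terms, the gate is nothing more than a check on how the session opens. A rough Python sketch of the idea, purely illustrative, since in the experiment the gate is an instruction to the model rather than code:

```python
RUNTIME_HEADER = "protocol: yuerdsl"

def session_is_bound(first_message: str) -> bool:
    """Allow execution only if the session opens with the runtime header."""
    lines = first_message.strip().splitlines()
    return bool(lines) and lines[0].strip() == RUNTIME_HEADER

print(session_is_bound("protocol: yuerdsl\nticker: ACME"))   # True
print(session_is_bound("What do you think of ACME stock?"))  # False
```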

Step 2: strict input contracts (DSL as a form)

Users don’t “ask questions.” They fill in a structured template: a DSL that works like a form.

Key rule: No completed template → no decision output

This alone eliminates most hallucination‑driven conclusions.
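
A small Python sketch of that rule; the field names are hypothetical, since the actual yuerdsl template isn’t reproduced in this post:

```python
# "No completed template → no decision output", expressed as a guard.
REQUIRED_FIELDS = ("ticker", "entry_price", "position_size", "stop_loss", "horizon")

def missing_fields(template: dict) -> list:
    """Return the fields still missing; an empty list means execution may proceed."""
    return [f for f in REQUIRED_FIELDS if template.get(f) in (None, "")]

submission = {"ticker": "ACME", "entry_price": 42.0}
gaps = missing_fields(submission)
if gaps:
    # The runtime refuses and asks for the gaps instead of guessing them.
    print(f"REFUSED: complete the template first (missing: {', '.join(gaps)})")
```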

Step 3: fixed execution pipeline

Once the template is complete, the runtime executes a fixed pipeline:

  1. Stage detection
  2. State compilation
  3. Structural risk analysis
  4. Decision grading (PASS / WATCH / STOP)
  5. Action list
  6. Audit receipt

There is no branching logic exposed to the user.
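
Written out as sequential code, the pipeline looks roughly like the sketch below. The stage functions are placeholder stubs; in the experiment the ordering is enforced by the protocol text rather than by Python, but the shape is the same.

```python
def detect_stage(template):
    return "position-open" if template.get("entry_price") else "pre-entry"

def compile_state(template, stage):
    return {"stage": stage, **template}

def analyse_risks(state):
    return [] if state.get("stop_loss") else ["no stop-loss defined"]

def grade(risks):
    return "PASS" if not risks else "WATCH"

def run_pipeline(template: dict) -> dict:
    stage = detect_stage(template)                  # 1. stage detection
    state = compile_state(template, stage)          # 2. state compilation
    risks = analyse_risks(state)                    # 3. structural risk analysis
    decision = grade(risks)                         # 4. decision grading
    actions = risks if decision != "PASS" else []   # 5. action list
    return {"inputs": template, "grade": decision,  # 6. audit receipt
            "risks": risks, "actions": actions}

print(run_pipeline({"ticker": "ACME", "entry_price": 42.0}))
```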

Step 4: auditable output

Each run ends with an audit receipt containing:

  • Input digest
  • Key variables
  • Assumptions
  • Decision grade
  • Action priorities

This makes runs comparable and replayable. Same input → same structure → same decision grade.
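
One way to see why that holds: give every receipt a digest of the canonicalised inputs and keep the remaining fields fixed, and any two runs over the same inputs can be compared line by line. The field names below are illustrative, not the exact receipt format.

```python
import hashlib
import json

def input_digest(template: dict) -> str:
    """Canonicalise the inputs so identical runs always share a digest."""
    canonical = json.dumps(template, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

receipt = {
    "input_digest": input_digest({"ticker": "ACME", "entry_price": 42.0, "stop_loss": 38.5}),
    "key_variables": {"entry_price": 42.0, "stop_loss": 38.5},
    "assumptions": ["prices quoted in USD"],
    "decision_grade": "WATCH",
    "action_priorities": ["confirm position size before entry"],
}
print(json.dumps(receipt, indent=2))
```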

Isn’t this just prompt engineering?

Not really. Prompt engineering optimizes what you say to the model; a protocol constrains what the model is allowed to do. Here the model is required to be consistent rather than clever.

Why not agents or tools?

Agents and tools are powerful, but they add complexity:

  • Tool failure modes
  • State synchronization
  • Backend dependencies

This experiment intentionally asks a narrower question: How far can we go with zero infrastructure, using only protocol design? For lightweight, client‑only scenarios, the answer is surprisingly far.

Why GPT (the client)?

This is an engineering choice, not a brand preference. At the moment, GPT offers:

  • Stable adherence to long structured instructions
  • Reliable parsing of form‑like input
  • A consistent client‑side execution environment

The approach itself is model‑agnostic.

What this experiment shows

LLMs are probabilistic, but with strict contracts and refusal rules you can still get:

  • Repeatable decisions
  • Clear failure modes
  • Human‑verifiable traces

That’s often enough for real‑world use.

Final thought

Instead of asking:

“How do we make LLMs smarter?”

It may be more productive to ask:

“How do we make them accountable?”

Sometimes, better constraints beat bigger models.

Project / DSL schemas
