Write Powerful System Prompts | Prompt Engineering - Build AI Platforms From Scratch #3

Published: December 12, 2025 at 04:30 PM EST
5 min read
Source: Dev.to

What is Prompt Engineering?

And how does it make or break AI platforms?

Prompt engineering is the discipline of designing instructions that control AI behavior. Your prompt is the blueprint for how an AI interprets requests and generates responses.

In AI platforms, prompts aren’t just suggestions—they’re the entire control mechanism.

  • Bad prompts → unpredictable outputs, hallucinations, parsing failures, user frustration.
  • Good prompts → consistent behavior, reliable outputs, maintainable systems, scalable platforms.

The difference between a working AI product and a broken one often comes down to prompt quality:

  • Vague prompts → AI invents its own interpretations.
  • Contradictory prompts → AI picks randomly between conflicting instructions.
  • Well‑structured prompts → AI follows rules consistently.

Prompts are architectural decisions that determine whether your AI layer functions reliably or fails under edge cases.

An Example from My Project

Short example from Emstrata

The World Builder is a foundational prompt in Emstrata that allows users to create custom narrative worlds based on their inputs, complete with characters, locations, items, and a coherent reality.

System prompt inputs

  • user-msg
  • title
  • prefs
  • genre
  • arc

World Builder outputs

  • prose("")
  • basis("")
  • char("name", "desc", "state")
  • item("name", "desc", "state")
  • location("name", "desc", "state")

How to Structure a System Prompt

Achieving maximum effectiveness with your prompts

Split your prompt into modules. Each one handles a distinct concern (a rough sketch of the assembled prompt follows the list):

  • Core Identity – what is this AI (chatbot, research agent, etc.) and what does it do.
  • Platform Specifics – context about where/how it operates, if that relates to the output.
  • Understanding Role – scope, responsibilities, boundaries.
  • Dissecting Requests – how to parse incoming data.
  • Response Expectations – exact output format with function calls.
  • Quality Standards – non‑negotiable benchmarks for output.
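
Here is a minimal sketch of how these modules might be assembled into a single system prompt. The module names follow the list above; the text inside each module is placeholder wording, not the actual Emstrata prompt.

# Module text is illustrative only; the real content of each module is up to you.
CORE_IDENTITY = "You are a World Builder: you turn user ideas into coherent narrative worlds."
PLATFORM_SPECIFICS = "You run inside an interactive fiction platform; your output is parsed by code, not shown raw."
UNDERSTANDING_ROLE = "You only build the world. You do not advance the story or speak as characters."
DISSECTING_REQUESTS = "Requests arrive as key-value pairs: user-msg, title, prefs, genre, arc."
RESPONSE_EXPECTATIONS = (
    'Respond ONLY with these functions: prose(""), basis(""), '
    'char("name", "desc", "state"), item("name", "desc", "state"), '
    'location("name", "desc", "state").'
)
QUALITY_STANDARDS = "Every element must fit the requested genre. NEVER invent new functions or arguments."

# Each module stays independently editable; the final prompt is just their concatenation.
SYSTEM_PROMPT = "\n\n".join([
    CORE_IDENTITY,
    PLATFORM_SPECIFICS,
    UNDERSTANDING_ROLE,
    DISSECTING_REQUESTS,
    RESPONSE_EXPECTATIONS,
    QUALITY_STANDARDS,
])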

Request Structure

What I suggest

Requests should be expressed as clear key‑value pairs.

user-input: "This is what the user said!"
convo-summary: "This is a breakdown of what was said previously"

You can list as many pairs as you need and split the input however makes sense, provided the structure stays logical and the AI can still infer the intent once the input elements are defined.
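
To make this concrete, here is a small sketch of assembling such a request from individual fields; the helper name is made up, and the fields are the two shown above plus the saved-prefs idea used later in this post.

def build_request(user_input: str, convo_summary: str, saved_prefs: str) -> str:
    """Assemble the request as explicit key-value pairs the model can parse."""
    # One pair per line; the key tells the model exactly what each value is.
    return (
        f'user-input: "{user_input}"\n'
        f'convo-summary: "{convo_summary}"\n'
        f'saved-prefs: "{saved_prefs}"'
    )

print(build_request("Attack the goblin!", "The party just entered the cave.", "Prefers short, punchy scenes."))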

Response Structure

What I suggest

Structured Output – returning predefined functions instead of free‑form text makes parsing easier, keeps responses logically coherent, improves the AI's understanding, and helps prevent hallucinations.

Argument Definition – define arguments beforehand and enforce correct order and types (e.g., string, number, likelihood/1000).

Example predefined functions

try("Outcome 1", "Outcome 2", 100/1000)
# Means: 10% chance outcome 1 succeeds, 90% chance outcome 2 happens.

attack(damage, 20)
# Means: the attack does 20 points of damage. (damage is a keyword, not a string)

speak("this is some example dialogue")
# Represents example dialogue coming from the AI.
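
Because the point of predefined functions is that code can parse them, here is a rough sketch of one way to pull calls like these out of a response; the regular expression and helper are assumptions for illustration, not the parser behind any particular platform.

import re

# Matches calls such as speak("hi"), attack(damage, 20), try("A", "B", 100/1000).
# Deliberately simple: a quoted argument containing a comma or ")" would break it.
CALL_PATTERN = re.compile(r'(\w+)\(([^)]*)\)')

def parse_calls(response: str) -> list[tuple[str, list[str]]]:
    """Return (function_name, raw_argument_list) pairs found in the response."""
    calls = []
    for name, raw_args in CALL_PATTERN.findall(response):
        args = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
        calls.append((name, args))
    return calls

print(parse_calls('speak("Hello there", cheerful)\nattack(damage, 20)'))
# [('speak', ['"Hello there"', 'cheerful']), ('attack', ['damage', '20'])]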

What is a Function

For non‑coders

A function is a preset command that can hold arguments which determine specific outcomes.

  • Anatomy – functionName(argument1, argument2, argument3)
  • Function Name – the command type (e.g., speak, create, attack, move).
  • Parentheses () – container holding the arguments.
  • Arguments – values that determine what actually happens, separated by commas, in a specific order.

Example

speak("Hello there", cheerful)
  • speak = command type
  • "Hello there" = what gets spoken
  • cheerful = how it’s spoken

Modularity

The benefits of breaking system prompts down

Modules are independently updatable—fix one without rewriting everything. Think of it like code architecture with a separation of concerns.

Advantages

  • Independent updates – fix one module without touching others.
  • Reusability – drop the same module into different prompts.
  • Clarity – each module has one job, easy to understand.
  • Collaboration – team members can work on different modules simultaneously.
  • Debugging – isolate issues to specific modules instead of hunting through a massive block of text.
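
As a small illustration of the reusability point, the same module can be dropped verbatim into different prompts; the module text below is made up.

# One shared module, reused across two different system prompts.
QUALITY_STANDARDS = (
    "QUALITY STANDARDS:\n"
    "- Use ONLY the functions defined above.\n"
    "- NEVER invent new functions or arguments."
)

world_builder_prompt = "You build narrative worlds from user input.\n\n" + QUALITY_STANDARDS
dialogue_writer_prompt = "You write in-character dialogue for existing worlds.\n\n" + QUALITY_STANDARDS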

Inputs and Outputs

How these make or break a system prompt

  • Input only what you absolutely need to get the expected output. Extra information can cause the model to deprioritize more important inputs.

    • Example hierarchy: user-input > convo-history > saved-prefs.
  • Output only the necessities as well. The more elements generated, the higher the chance of confusion or deprioritization.

  • Iterative refinement – cycle through building → reducing → building → reducing. You’ll often discover you were overthinking both parts.
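
One way to act on that hierarchy is to keep the top-priority input intact and cap the rest so they cannot crowd it out; the character budgets below are invented for illustration.

def build_inputs(user_input: str, convo_history: str, saved_prefs: str) -> str:
    """Keep the highest-priority input whole and trim the lower-priority ones."""
    convo_history = convo_history[-1500:]  # keep only the most recent context
    saved_prefs = saved_prefs[:300]        # lowest priority, trimmed hardest
    return (
        f'user-input: "{user_input}"\n'
        f'convo-history: "{convo_history}"\n'
        f'saved-prefs: "{saved_prefs}"'
    )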

Defining Rules

Ensuring rules are stressed and enforced

  • Emphasize critical rules (ALL CAPS, repetition, strategic placement at the beginning or end of modules).
  • Eliminate contradictory instructions. Conflicting cues (e.g., “be concise” vs. “provide extensive detail”) cause the AI to pick arbitrarily.
  • Be explicit about constraints—what the AI cannot do is as important as what it can do.
  • Define argument types strictly and repeat them (e.g., string "text in quotes", number without quotes, likelihood num/1000).

Bad rule example: “Respond appropriately.”

Good rule example: “You must respond using only the functions defined in the Response Expectations Module. Do not invent new functions or arguments.”
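
Rules like the good example above are easiest to keep enforced when the platform also checks the output. Building on the simple parsing sketch earlier, a validator might look something like this; the allowed function names are the World Builder ones from this post, used purely as an example.

ALLOWED_FUNCTIONS = {"prose", "basis", "char", "item", "location"}

def violates_rules(calls: list[tuple[str, list[str]]]) -> list[str]:
    """Return a list of rule violations; an empty list means the response passed."""
    problems = []
    for name, args in calls:
        if name not in ALLOWED_FUNCTIONS:
            problems.append(f"invented function: {name}")
        if not args:
            problems.append(f"{name} was called with no arguments")
    return problems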

Preventing Hallucinations

Techniques to keep AIs on track

Restricting AIs to a specific formatted response is one of the best ways to avoid flawed outputs, e.g.:

response("this is an example of a response", "This is the second argument to this function")
  • Reiterate that the AI cannot make up its own functions or arguments.
  • Be hyper‑specific about requirements; eliminate contradictory prompting that may confuse the AI.
  • Use ALL CAPS when you need to stress an aspect to the AI.
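
A natural extension of restricting the format is to reject any response that breaks it and ask again with a pointed reminder. This is only a sketch: call_model stands in for whatever model API you use, and it reuses the parse_calls and violates_rules helpers sketched earlier.

def get_valid_response(call_model, request: str, max_attempts: int = 3) -> str:
    """Re-prompt until the response passes validation or attempts run out."""
    reminder = ""
    for _ in range(max_attempts):
        response = call_model(request + reminder)         # call_model is a hypothetical stand-in
        problems = violates_rules(parse_calls(response))  # helpers from the earlier sketches
        if not problems:
            return response
        # Feed the violations back so the model knows exactly what to fix.
        reminder = "\n\nYOUR LAST RESPONSE BROKE THESE RULES: " + "; ".join(problems)
    raise ValueError("Model kept returning invalid output: " + "; ".join(problems))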

Trial & Error

Iteration is absolutely necessary to achieve good results

  • Different LLMs react differently to prompting styles. Some follow rules better than others.
  • Testing, iterating, and saving updates are essential for reliable performance.
  • Consider model size relative to task complexity—a massive response with complex logic may require a larger (and likely more expensive) model.

Major Takeaways

What to remember

  • Modular architecture beats monolithic blocks every time. Easier to debug, reuse, and collaborate on.
  • Define clear input/output structures using key‑value pairs and preset functions. Ambiguity kills AI performance.
  • Minimize both inputs and outputs to absolute essentials. More complexity = more confusion = worse results.
  • Formatted responses (function calls with strict argument types) are your best defense against hallucinations.
  • Test, reduce, test again. Your first draft won't be your final prompt.

📺 View this module with video & slides
