I gave 8 AI agents an island and watched a society emerge — wars, gossip, grudges, and peace
Source: Dev.to

Tiny Civilization: what happens when AI agents have to live together
I grew up on Age of Empires, Sid Meier’s Civilization, and Rise of Nations. The thing that hooked me was never the graphics — it was the systems. You set a few rules in motion and a whole world spills out of them: economies, rivalries, alliances, betrayals.
Years later I watched OpenAI’s hide-and-seek multi-agent video (writeup), where agents that were only rewarded for hiding and seeking invented tools and counter-strategies nobody coded — ramps, box-surfing, fort-building. Emergent behavior from simple pressure. That broke something open for me.
So I asked a smaller question: forget winning a game — what if AI agents just had to live in a society together? Would they behave like us? Hold grudges? Gossip? Make peace because they’re tired of fighting?
That became Tiny Civilization — a browser sim where 2–8 agents with distinct personalities live on a small island, gathering, building, trading, stealing, gossiping, holding grudges, making peace, and remembering it all across lives.
👉 Live demo — runs keyless in “instinct mode,” or plug in a key for LLM minds.
The whole thing — every line — was built with Claude Code, using the Fable model, right before Fable retired. It felt fitting to send a storytelling model off by having it build a world full of little stories.
The problem: pure-LLM agents will bankrupt you and pure-utility agents are boring
The first design decision was the hardest. Two obvious options, both bad:
Call the LLM every tick. Every agent, every day, makes an API call. Beautiful, expressive — and it costs a fortune and crawls.
Pure utility AI (the classic RTS approach). Fast and free, but agents can’t scheme, can’t talk, can’t surprise you. It’s just min-maxing.
So I split the brain in two:
| Layer | Decides | Cadence | Cost |
| --- | --- | --- | --- |
| LLM mind | Strategy (gather/build/trade/befriend/aggress/reconcile/defend), per-neighbor stances, an inner thought, and all dialogue | ~every 15 sim-days | ~150 calls / 1,000 days |
| Utility engine | Each day’s concrete action: eat, sleep, gather, steal, attack, gift, trade, make peace | every tick | free, local |
The LLM declares intent — “aggress against Kai, he raided my base” — and that biases the utility scores for the next two weeks. The body runs on instinct (hunger, energy, storms); the mind sets direction. This is the trick that makes it both affordable and alive.
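To make the split concrete, here is a minimal sketch of how a declared intent could bias per-day utility scores. Everything in it is an assumption for illustration: the type names, the base scores, and the multiplier table are hypothetical and are not taken from the project's code.

```typescript
// Hypothetical sketch: all names and numbers here are illustrative assumptions.
type Strategy =
  | "gather" | "build" | "trade" | "befriend"
  | "aggress" | "reconcile" | "defend";
type Action =
  | "eat" | "sleep" | "gather" | "steal"
  | "attack" | "gift" | "trade" | "makePeace";

interface Intent {
  strategy: Strategy;
  target?: string; // neighbor the strategy is aimed at, if any
}

interface Needs {
  hunger: number; // 0..1, higher means hungrier
  energy: number; // 0..1, higher means more rested
}

// Assumed multipliers: how much each strategy boosts related actions.
const BIAS: Record<Strategy, Partial<Record<Action, number>>> = {
  gather: { gather: 1.5 },
  build: { gather: 1.2 },
  trade: { trade: 1.5, gift: 1.2 },
  befriend: { gift: 1.5, trade: 1.2 },
  aggress: { attack: 2.0, steal: 1.5 },
  reconcile: { makePeace: 2.0, gift: 1.3 },
  defend: { sleep: 1.1 },
};

// Instinct layer: scores driven by bodily needs plus fixed defaults.
function baseScores(needs: Needs): Record<Action, number> {
  return {
    eat: needs.hunger,
    sleep: 1 - needs.energy,
    gather: 0.4,
    steal: 0.2,
    attack: 0.3,
    gift: 0.2,
    trade: 0.3,
    makePeace: 0.2,
  };
}

// Pick the highest-scoring action after applying the intent's multipliers.
function chooseAction(needs: Needs, intent: Intent): Action {
  const scores = baseScores(needs);
  const bias = BIAS[intent.strategy];
  let best: Action = "eat";
  let bestScore = -Infinity;
  for (const action of Object.keys(scores) as Action[]) {
    const score = scores[action] * (bias[action] ?? 1);
    if (score > bestScore) {
      bestScore = score;
      best = action;
    }
  }
  return best;
}
```

In this sketch a fed, rested agent with an "aggress" intent picks `attack`, while a starving agent with the same intent still picks `eat`, because the multipliers tilt the scores rather than override the body's needs.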
Memory across lives — where it got strange
When a run ends, each agent’s life is distilled into memory lines:
- “you won with score 200”
- “Maya destroyed your home”
- “you and Kai made peace after a feud”
- “this life hardened you — you trust less now”
These lines are stored in localStorage, keyed by agent name, and injected into the next run’s prompts. Agents start referencing past lives in dialogue, pre-emptively paying reparations to remembered enemies, and trusting remembered allies, sometimes to their own ruin.
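A rough sketch of that persistence step, under stated assumptions: the key prefix, the line cap, and the function names below are hypothetical, and the store is written against a small interface so it runs outside a browser. In the browser, `window.localStorage` satisfies the same `getItem`/`setItem` shape.

```typescript
// Minimal storage shape; the browser's localStorage is compatible with it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch also runs in tests or Node.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

const KEY_PREFIX = "tinyciv:memory:"; // hypothetical key scheme
const MAX_LINES = 12; // assumed cap so prompts stay short

// Read an agent's memory lines; tolerate missing or corrupt entries.
function loadMemories(store: KeyValueStore, agentName: string): string[] {
  const raw = store.getItem(KEY_PREFIX + agentName);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    return parsed.filter((x): x is string => typeof x === "string");
  } catch {
    return [];
  }
}

// Append the lines distilled from a finished life, keeping the newest ones.
function saveLife(store: KeyValueStore, agentName: string, lines: string[]): void {
  const merged = [...loadMemories(store, agentName), ...lines].slice(-MAX_LINES);
  store.setItem(KEY_PREFIX + agentName, JSON.stringify(merged));
}

// Build the block of text that would be injected into the next run's prompt.
function memoryPromptSection(store: KeyValueStore, agentName: string): string {
  const lines = loadMemories(store, agentName);
  if (lines.length === 0) return "";
  return "Memories from your past lives:\n" + lines.map((l) => `- ${l}`).join("\n");
}
```

Keying by agent name is what lets the same character carry a grudge into a fresh island: a new run only needs the name to look up everything that agent "remembers".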
How I actually built and balanced it
This is the part I’m proudest of, and it’s pure childhood-strategy-game energy: you can’t balance a society by vibes. So the workflow was:
A pure, deterministic simulation core — zero DOM, zero AI. The same runTick powers the browser, the tests, and a batch runner.
A seeded experiment runner. npm run experiment -- --runs 30 --days 1000 --seed 1 runs 30 reproducible lifetimes and spits out a win-rate/score table. Every balance change landed with a before/after table. (Example: a Hermit rebalance moved one agent from 0/30 wins to 9–11/30 without breaking the other archetypes.)
A 16-gate regression suite. The justification gate (no grievance → no violence), war burnout, reconciliation pricing, positive-sum trade, granary protection, homelessness-death, trait drift — each one locked behind a headless test so balance changes can’t silently regress behavior.
Change a dial in constants.ts → run the experiment → read the table. That was the entire loop.
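The reason that loop is trustworthy is reproducibility: the same seed has to produce the same table. Below is a minimal sketch of a seeded runner. The `simulateLife` function is a toy stand-in and is not the project's `runTick`; the option names simply mirror the CLI flags quoted above, and the PRNG is the small public-domain mulberry32 generator, which is my choice for illustration.

```typescript
// Small seeded PRNG (mulberry32); each call returns a float in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface ExperimentOptions {
  runs: number;
  days: number;
  seed: number;
}

// Toy stand-in for one lifetime: every agent gains a random score each day.
// All randomness comes from the injected rng, never from Math.random().
function simulateLife(agents: string[], days: number, rng: () => number): string {
  const scores = agents.map(() => 0);
  for (let day = 0; day < days; day++) {
    for (let i = 0; i < agents.length; i++) {
      scores[i] += rng();
    }
  }
  let winner = 0;
  for (let i = 1; i < agents.length; i++) {
    if (scores[i] > scores[winner]) winner = i;
  }
  return agents[winner];
}

// Run several lifetimes, each with its own derived seed, and count wins.
function runExperiment(agents: string[], opts: ExperimentOptions): Record<string, number> {
  const wins: Record<string, number> = {};
  for (const name of agents) wins[name] = 0;
  for (let run = 0; run < opts.runs; run++) {
    const rng = mulberry32(opts.seed + run);
    wins[simulateLife(agents, opts.days, rng)] += 1;
  }
  return wins;
}
```

Calling `runExperiment(["Kai", "Maya", "Hermit"], { runs: 30, days: 1000, seed: 1 })` twice returns identical win counts, which is the property that makes a before/after table meaningful.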
What emerged (none of this is scripted)
Running the same island over and over, with memory on, produced a coherent arc:
Massacres. Early on, the warrior just killed everyone. No deterrence existed.
Forever wars. I added a justification gate (violence needs a real grievance — theft, attack, trespass). That fixed unprovoked killing… but now wars never ended: 495 fruitless attacks across 1,500 days.
Diplomacy. Reconciliation + escalating reparations + war-weariness made endings inevitable. Attacks per 2,000-day run collapsed: 594 → 14 → 0.
The kleptocracy. With war capped, theft became the unpunished crime — 340 thefts/run. I fixed it the human way: granaries. Fortification, not punishment.
The golden age. A clean-slate run, no memories: zero attacks in 1,000 days, and the Warrior won by out-trading everyone (118 trades, 1 attack).
The fall. The very next run — now remembering that golden age — collapsed. Remembered trust lowered everyone’s guard, which raised the payoff of betrayal. Scores dropped ~15%; every relationship ended negative. Peace between strangers turned out to be easier than peace between old friends with open tabs.
The recurring lesson: every time I patched one form of conflict, the agents found the next-cheapest one. Massacres → wars → theft → litigation. Exactly like us.
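For readers who want to see the shape of those patches, here is a hedged sketch of three of them: the justification gate, war-weariness, and escalating reparations. The field names, the base price, and the doubling rule are assumptions chosen for illustration; the source only says that violence needs a grievance, that weariness builds, and that reparations escalate.

```typescript
// Hypothetical data shapes; not the project's actual types.
type GrievanceKind = "theft" | "attack" | "trespass";

interface Grievance {
  against: string;
  kind: GrievanceKind;
  day: number;
}

interface AgentState {
  name: string;
  grievances: Grievance[];
  warWeariness: number; // 0..1, assumed to rise as a feud drags on
  pastReconciliations: Record<string, number>; // peace deals per neighbor
}

// Justification gate: no grievance against the target means no violence.
function mayAttack(agent: AgentState, target: string): boolean {
  return agent.grievances.some((g) => g.against === target);
}

// War-weariness scales the appeal of attacking down toward zero.
function attackScore(agent: AgentState, target: string, baseScore: number): number {
  if (!mayAttack(agent, target)) return 0;
  return baseScore * (1 - agent.warWeariness);
}

const BASE_REPARATION = 5; // assumed starting price of peace

// Escalating reparations: each earlier peace with the same neighbor
// doubles the price of the next one (the doubling is an assumption).
function reparationCost(agent: AgentState, target: string): number {
  const prior = agent.pastReconciliations[target] ?? 0;
  return BASE_REPARATION * Math.pow(2, prior);
}
```

Each rule is a pure function of agent state, which is also what makes it easy to lock behind a headless regression test.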
Stack
TypeScript, React, Zustand, Vite, Recharts. Default mind is z.ai GLM, but any OpenAI-compatible provider works per-agent — so you can literally pit Claude vs GLM vs Gemini in the same village and watch model-vs-model diplomacy. Keys never touch the browser (server-side proxy), and an adaptive-pacing controller learns each key’s real rate ceiling.
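The post does not say how the adaptive-pacing controller works, so the following is only one plausible shape for it, not the project's implementation: shorten the delay between calls by a fixed step after each success, and double it whenever the provider signals a rate limit. The class name, defaults, and step sizes are all hypothetical.

```typescript
// Hypothetical pacer: probes toward a key's rate ceiling, backs off when hit.
class AdaptivePacer {
  private intervalMs: number;
  private readonly minIntervalMs: number;
  private readonly maxIntervalMs: number;
  private readonly stepMs: number;

  constructor(startIntervalMs = 2000, minIntervalMs = 200, maxIntervalMs = 60000, stepMs = 100) {
    this.intervalMs = startIntervalMs;
    this.minIntervalMs = minIntervalMs;
    this.maxIntervalMs = maxIntervalMs;
    this.stepMs = stepMs;
  }

  // A successful call: wait slightly less before the next one.
  onSuccess(): void {
    this.intervalMs = Math.max(this.minIntervalMs, this.intervalMs - this.stepMs);
  }

  // A rate-limit response (for example HTTP 429): double the wait.
  onRateLimited(): void {
    this.intervalMs = Math.min(this.maxIntervalMs, this.intervalMs * 2);
  }

  // Delay to apply before the next request for this key.
  get delayMs(): number {
    return this.intervalMs;
  }
}
```

Keeping one pacer instance per API key would let each provider settle at its own sustainable pace, which matters when several models share a village.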
Try it: https://multiagentciv.netlify.app/
Code: https://github.com/dhrupo/multi-agent-civilization
If you played the same strategy games I did, I think you’ll feel right at home watching this thing run.