I gave 8 AI agents an island and watched a society emerge — wars, gossip, grudges, and peace

Published: June 14, 2026 at 03:30 AM EDT
5 min read
Source: Dev.to

By [Dhrupo Nil](https://dev.to/dhrupo)

Tiny Civilization: what happens when AI agents have to live together

I grew up on Age of Empires, Sid Meier’s Civilization, and Rise of Nations. The thing that hooked me was never the graphics — it was the systems. You set a few rules in motion and a whole world spills out of them: economies, rivalries, alliances, betrayals.

Years later I watched OpenAI’s hide-and-seek multi-agent video (writeup), where agents that were only rewarded for hiding and seeking invented tools and counter-strategies nobody coded — ramps, box-surfing, fort-building. Emergent behavior from simple pressure. That broke something open for me.

So I asked a smaller question: forget winning a game — what if AI agents just had to live in a society together? Would they behave like us? Hold grudges? Gossip? Make peace because they’re tired of fighting?

That became Tiny Civilization — a browser sim where 2–8 agents with distinct personalities live on a small island, gathering, building, trading, stealing, gossiping, holding grudges, making peace, and remembering it all across lives.

👉 Live demo — runs keyless in “instinct mode,” or plug in a key for LLM minds.

The whole thing — every line — was built with Claude Code, using the Fable model, right before Fable retired. It felt fitting to send a storytelling model off by having it build a world full of little stories.

The problem: pure-LLM agents will bankrupt you and pure-utility agents are boring

The first design decision was the hardest. Two obvious options, both bad:

Call the LLM every tick. Every agent, every day, makes an API call. Beautiful, expressive — and it costs a fortune and crawls.

Pure utility AI (the classic RTS approach). Fast and free, but agents can’t scheme, can’t talk, can’t surprise you. It’s just min-maxing.

So I split the brain in two:

| Layer | Decides | Cadence | Cost |
| --- | --- | --- | --- |
| LLM mind | Strategy (gather/build/trade/befriend/aggress/reconcile/defend), per-neighbor stances, an inner thought, and all dialogue | ~every 15 sim-days | ~150 calls / 1,000 days |
| Utility engine | Each day’s concrete action: eat, sleep, gather, steal, attack, gift, trade, make peace | every tick | free, local |

The LLM declares intent — “aggress against Kai, he raided my base” — and that biases the utility scores for the next two weeks. The body runs on instinct (hunger, energy, storms); the mind sets direction. This is the trick that makes it both affordable and alive.
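To make the split concrete, here is a minimal sketch of how a declared intent could bias per-action utility scores. It is an illustration under assumptions, not the project's actual code: the names `Body`, `baseScores`, `BIAS`, and `chooseAction`, and every number in them, are hypothetical.

```typescript
// Hypothetical sketch: all names and weights here are illustrative assumptions.
type Strategy = "gather" | "build" | "trade" | "befriend" | "aggress" | "reconcile" | "defend";
type Action = "eat" | "sleep" | "gather" | "steal" | "attack" | "gift" | "trade" | "makePeace";

interface Body {
  hunger: number; // 0 = full, 1 = starving
  energy: number; // 0 = exhausted, 1 = rested
}

// The "instinct" layer: scores driven by bodily needs plus fixed defaults.
function baseScores(body: Body): Record<Action, number> {
  return {
    eat: body.hunger * 2,
    sleep: (1 - body.energy) * 2,
    gather: 0.5,
    steal: 0.1,
    attack: 0.1,
    gift: 0.2,
    trade: 0.4,
    makePeace: 0.2,
  };
}

// Additive bonus each strategy grants to the actions it favors (assumed values).
const BIAS: Record<Strategy, Partial<Record<Action, number>>> = {
  gather: { gather: 0.6 },
  build: { gather: 0.3 },
  trade: { trade: 0.6 },
  befriend: { gift: 0.6, trade: 0.2 },
  aggress: { attack: 0.8, steal: 0.4 },
  reconcile: { makePeace: 0.8, gift: 0.3 },
  defend: { sleep: 0.1 },
};

// Pick the highest-scoring action after applying the mind's current intent.
function chooseAction(body: Body, intent: Strategy): Action {
  const scores = baseScores(body);
  const bias = BIAS[intent];
  let best: Action = "eat";
  let bestScore = -Infinity;
  for (const action of Object.keys(scores) as Action[]) {
    const score = scores[action] + (bias[action] ?? 0);
    if (score > bestScore) {
      best = action;
      bestScore = score;
    }
  }
  return best;
}
```

In this sketch the bias is additive, so a starving agent still eats even while its mind is set on aggression: instinct can override intent, which matches the "body runs on instinct, mind sets direction" idea.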

Memory across lives — where it got strange

When a run ends, each agent’s life is distilled into memory lines:

  • “you won with score 200”

  • “Maya destroyed your home”

  • “you and Kai made peace after a feud”

  • “this life hardened you — you trust less now”

These lines are stored in localStorage, keyed by agent name, and injected into the next run’s prompts. Agents start referencing past lives in dialogue, pre-emptively paying reparations to remembered enemies, and trusting remembered allies, sometimes to their own ruin.
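A rough sketch of that persistence step might look like the following. The key prefix, the line cap, and the function names (`loadMemories`, `rememberLife`, `memoryPrompt`) are assumptions made for illustration. The `KeyValueStore` interface is the small subset of the Web Storage API that `window.localStorage` already satisfies, so the same code runs in a browser or in a headless test.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

const KEY_PREFIX = "tinyciv:memory:"; // hypothetical key scheme
const MAX_LINES = 12; // assumed cap so prompts stay short

// Read one agent's memory lines; tolerate missing or corrupted entries.
function loadMemories(store: KeyValueStore, agent: string): string[] {
  const raw = store.getItem(KEY_PREFIX + agent);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed)
      ? parsed.filter((x): x is string => typeof x === "string")
      : [];
  } catch {
    return []; // corrupted entry: start from a clean slate
  }
}

// Append the lines distilled from the life that just ended, keeping the newest.
function rememberLife(store: KeyValueStore, agent: string, lines: string[]): void {
  const merged = [...loadMemories(store, agent), ...lines].slice(-MAX_LINES);
  store.setItem(KEY_PREFIX + agent, JSON.stringify(merged));
}

// Build the text block that would be injected into the next run's prompt.
function memoryPrompt(store: KeyValueStore, agent: string): string {
  const lines = loadMemories(store, agent);
  if (lines.length === 0) return "You have no memories of earlier lives.";
  return "Memories from earlier lives:\n" + lines.map((l) => "- " + l).join("\n");
}
```

In a browser, passing `window.localStorage` as the `store` argument would persist the lines between page loads.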

How I actually built and balanced it

This is the part I’m proudest of, and it’s pure childhood-strategy-game energy: you can’t balance a society by vibes. So the workflow was:

A pure, deterministic simulation core — zero DOM, zero AI. The same runTick powers the browser, the tests, and a batch runner.

A seeded experiment runner. npm run experiment -- --runs 30 --days 1000 --seed 1 runs 30 reproducible lifetimes and spits out a win-rate/score table. Every balance change landed with a before/after table. (Example: a Hermit rebalance moved one agent from 0/30 wins to 9–11/30 without breaking the other archetypes.)

A 16-gate regression suite. The justification gate (no grievance → no violence), war burnout, reconciliation pricing, positive-sum trade, granary protection, homelessness-death, trait drift — each one locked behind a headless test so balance changes can’t silently regress behavior.

Change a dial in constants.ts → run the experiment → read the table. That was the entire loop.
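The loop above depends on every lifetime being reproducible from its seed. Here is a minimal sketch of that idea, assuming a small seeded generator (mulberry32, a well-known 32-bit PRNG) and a toy `runTick` whose rule is invented purely for illustration. The `World` shape and `runExperiment` signature are hypothetical and much simpler than a real simulation core.

```typescript
// mulberry32: a small, fast, seedable PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface World {
  day: number;
  scores: Record<string, number>;
}

// Pure tick: no DOM, no clock, no Math.random. All randomness comes from rng.
function runTick(world: World, rng: () => number): World {
  const scores: Record<string, number> = {};
  for (const name of Object.keys(world.scores)) {
    // Toy rule (an assumption): each agent gains a point on a coin flip.
    scores[name] = world.scores[name] + (rng() < 0.5 ? 1 : 0);
  }
  return { day: world.day + 1, scores };
}

// Run `runs` lifetimes of `days` ticks each and count wins per agent.
// Assumes `names` is non-empty; ties go to the agent listed first.
function runExperiment(
  names: string[],
  runs: number,
  days: number,
  seed: number
): Record<string, number> {
  const wins: Record<string, number> = {};
  for (const n of names) wins[n] = 0;
  for (let r = 0; r < runs; r++) {
    const rng = mulberry32(seed + r); // one reproducible stream per lifetime
    const start: Record<string, number> = {};
    for (const n of names) start[n] = 0;
    let world: World = { day: 0, scores: start };
    for (let d = 0; d < days; d++) world = runTick(world, rng);
    const winner = names.reduce((a, b) => (world.scores[b] > world.scores[a] ? b : a));
    wins[winner] += 1;
  }
  return wins;
}
```

Because the only source of randomness is the seeded generator, calling `runExperiment` twice with the same arguments yields the same win table, which is what makes a before/after comparison of a balance change meaningful.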

What emerged (none of this is scripted)

Running the same island over and over, with memory on, produced a coherent arc:

Massacres. Early on, the warrior just killed everyone. No deterrence existed.

Forever wars. I added a justification gate (violence needs a real grievance — theft, attack, trespass). That fixed unprovoked killing… but now wars never ended: 495 fruitless attacks across 1,500 days.

Diplomacy. Reconciliation + escalating reparations + war-weariness made endings inevitable. Attacks per 2,000-day run collapsed: 594 → 14 → 0.

The kleptocracy. With war capped, theft became the unpunished crime — 340 thefts/run. I fixed it the human way: granaries. Fortification, not punishment.

The golden age. A clean-slate run, no memories: zero attacks in 1,000 days, and the Warrior won by out-trading everyone (118 trades, 1 attack).

The fall. The very next run — now remembering that golden age — collapsed. Remembered trust lowered everyone’s guard, which raised the payoff of betrayal. Scores dropped ~15%; every relationship ended negative. Peace between strangers turned out to be easier than peace between old friends with open tabs.

The recurring lesson: every time I patched one form of conflict, the agents found the next-cheapest one. Massacres → wars → theft → litigation. Exactly like us.
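The three dials behind that arc (the justification gate, war-weariness, and escalating reparations) can be sketched as plain functions. This is one plausible reading, not the project's implementation: the `Agent` shape, the constants, and the doubling rule for reparations are all assumptions.

```typescript
// Hypothetical sketch: types, constants, and formulas are illustrative assumptions.
type GrievanceKind = "theft" | "attack" | "trespass";

interface Grievance {
  by: string; // name of the agent who committed the wrong
  kind: GrievanceKind;
  day: number;
}

interface Agent {
  name: string;
  grievances: Grievance[]; // wrongs done to this agent
  attacksMade: number; // drives war-weariness
}

// Justification gate: no grievance against the target, no violence.
function mayAttack(attacker: Agent, target: string): boolean {
  return attacker.grievances.some((g) => g.by === target);
}

// War-weariness: every attack already made lowers the appeal of the next one.
function attackScore(attacker: Agent, target: string, base: number): number {
  if (!mayAttack(attacker, target)) return 0;
  const WEARINESS_PER_ATTACK = 0.1; // assumed dial
  return Math.max(0, base - WEARINESS_PER_ATTACK * attacker.attacksMade);
}

// Escalating reparations: the price of peace doubles with each unresolved wrong.
function reparationCost(offender: string, victim: Agent): number {
  const wrongs = victim.grievances.filter((g) => g.by === offender).length;
  const BASE_COST = 5; // assumed dial
  return wrongs === 0 ? 0 : BASE_COST * Math.pow(2, wrongs - 1);
}

// Reconciliation clears the victim's grievances against the offender,
// which closes the justification gate again.
function makePeace(offender: string, victim: Agent): Agent {
  return { ...victim, grievances: victim.grievances.filter((g) => g.by !== offender) };
}
```

Under these assumptions a feud has two exits: the attacker's score decays to zero as attacks pile up, or the offender pays a price that grows the longer they wait.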

Stack

TypeScript, React, Zustand, Vite, Recharts. Default mind is z.ai GLM, but any OpenAI-compatible provider works per-agent — so you can literally pit Claude vs GLM vs Gemini in the same village and watch model-vs-model diplomacy. Keys never touch the browser (server-side proxy), and an adaptive-pacing controller learns each key’s real rate ceiling.
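The post does not say how the adaptive-pacing controller works. One common way to find a rate ceiling is additive-increase/multiplicative-decrease (AIMD), sketched below as a stand-in technique rather than the project's method. The class name, the default rates, and the idea of treating a rate-limit response (for example HTTP 429) as the back-off signal are all assumptions.

```typescript
// AIMD pacing sketch: a swapped-in technique, not the project's actual controller.
class AdaptivePacer {
  private ratePerMin: number;
  private readonly minRate: number;
  private readonly maxRate: number;

  constructor(startRate = 10, minRate = 1, maxRate = 120) {
    this.ratePerMin = startRate;
    this.minRate = minRate;
    this.maxRate = maxRate;
  }

  // Additive increase: probe a little faster after each successful call.
  onSuccess(): void {
    this.ratePerMin = Math.min(this.maxRate, this.ratePerMin + 1);
  }

  // Multiplicative decrease: halve the rate when the provider pushes back.
  onRateLimited(): void {
    this.ratePerMin = Math.max(this.minRate, this.ratePerMin / 2);
  }

  get rate(): number {
    return this.ratePerMin;
  }

  // Delay to wait before the next call at the current rate.
  get delayMs(): number {
    return 60000 / this.ratePerMin;
  }
}

// One pacer per API key, so each key converges on its own ceiling.
const pacers = new Map<string, AdaptivePacer>();

function pacerFor(keyId: string): AdaptivePacer {
  let pacer = pacers.get(keyId);
  if (pacer === undefined) {
    pacer = new AdaptivePacer();
    pacers.set(keyId, pacer);
  }
  return pacer;
}
```

A caller would wait `pacerFor(keyId).delayMs` between requests and report each outcome with `onSuccess()` or `onRateLimited()`, so the rate oscillates just below whatever limit that key actually has.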

Try it: https://multiagentciv.netlify.app/

Code: https://github.com/dhrupo/multi-agent-civilization

If you played the same strategy games I did, I think you’ll feel right at home watching this thing run.
