Understanding ReAct (Reasoning + Action) through a simple game

Published: February 4, 2026 at 12:16 AM EST
3 min read
Source: Dev.to

When I first moved from New York City to Philadelphia, I was starved for social interaction. So naturally, I did what any adult would do: I downloaded Meetup. My first social outing was a writers’ meetup, where the ice‑breaker was a childhood game called “Pass the Story.”

The “Pass the Story” Game

  1. A sheet of paper is passed around.
  2. One person writes a line, then passes the paper to the next person.
  3. Each new participant reads what’s been written, adds the next line, and passes it on.
  4. The process continues until the group has collectively created a story.

Mapping the Game to ReAct

ReAct (Reasoning + Acting) runs as an Observe → Reason → Act loop, which maps neatly onto the steps of the game:

  • Observe: the person reads the story so far.
  • Reason: the person thinks about how to continue it.
  • Act: the person writes the next line and passes the paper.

The first line on the page is the query. Each subsequent participant repeats the Observe → Reason → Act loop until the story is complete.
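The loop above can be sketched in a few lines of Python. This is a toy "Pass the Story" version of ReAct: `observe`, `reason`, and `act` are hypothetical stand-ins, not a real LLM API.

```python
def observe(story):
    """Observe: read the story so far (here, just the last line)."""
    return story[-1] if story else None

def reason(last_line):
    """Reason: decide how to continue (a trivial stand-in rule)."""
    return f"And then, after '{last_line}', something new happened."

def act(story, next_line):
    """Act: write the next line and pass the paper on."""
    story.append(next_line)
    return story

story = ["Once upon a time, a query arrived."]  # the first line is the query
for _ in range(3):                              # three participants take a turn
    last_line = observe(story)
    next_line = reason(last_line)
    act(story, next_line)

print(len(story))  # the query plus three added lines
```

Each pass through the loop only looks at what is already on the paper, which is exactly why the loop keeps working no matter who picks it up next.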

Statelessness of Language Models

Just like the next player has no knowledge of the story until they read the paper, LLMs are stateless: they don’t retain memory between API calls or conversation turns. Each request starts fresh—much like the character Ghajini in the Bollywood film of the same name.

Passing the full story forward is analogous to sending the conversation history back to the LLM in the next API call. The accumulated story becomes the context for the model.
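In code, "passing the whole story" looks like resending the message list on every call. Here `fake_llm` is a hypothetical stand-in for a real chat API; the point is that it only ever sees what it is handed.

```python
def fake_llm(messages):
    """Pretend model: its reply depends only on the messages passed in."""
    return f"(reply based on {len(messages)} messages of context)"

history = [{"role": "user", "content": "Start the story."}]

for _ in range(2):
    reply = fake_llm(history)  # the full history travels with every call
    history.append({"role": "assistant", "content": reply})
    history.append({"role": "user", "content": "Continue."})

print(len(history))
```

Nothing persists inside `fake_llm` between calls; if a message is dropped from `history`, the model has genuinely never seen it.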

Limitations and Practical Solutions

Infinite Looping

If a participant (or a model) has a very limited “vocabulary,” they may keep repeating the same line:

“Cats and dogs are animals. Animals are cats and dogs.”

In AI this manifests as infinite looping—the observation provides no new information, the reasoning repeats, and the same action is taken over and over.

Solution: Impose a hard limit on the number of iterations (e.g., “only allow 5 iterations”).
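A minimal sketch of that hard cap, using a deliberately "stuck" agent step (a stand-in that never finishes) to show the limit doing its job:

```python
MAX_ITERATIONS = 5

def step(state):
    """A stuck agent: the observation never changes, so it never finishes."""
    return state, False  # (new_state, done)

state, done, iterations = "start", False, 0
while not done and iterations < MAX_ITERATIONS:
    state, done = step(state)
    iterations += 1

print(iterations)  # stops at the cap instead of looping forever
```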

Context Window Constraints

When the story grows large (e.g., 100 pages), later participants can’t realistically read everything. LLMs face a similar issue: each Observe → Reason → Act loop adds more tokens, which:

  • Increases cost
  • Slows response time
  • Risks hitting the model’s context limit

Solution: Context pruning – instead of sending the entire history each time, keep only the most recent steps and maintain a high‑level summary of earlier steps.
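One simple pruning scheme: keep a rolling summary plus only the most recent K steps. Here `summarize` is a naive stand-in for an LLM summarization call.

```python
KEEP_RECENT = 3  # how many full steps to keep verbatim

def summarize(summary, dropped):
    """Fold dropped steps into a one-line summary (toy stand-in)."""
    return f"{summary} (+{len(dropped)} earlier steps condensed)"

def prune(summary, steps):
    """Keep the last KEEP_RECENT steps; summarize everything older."""
    if len(steps) <= KEEP_RECENT:
        return summary, steps
    dropped, recent = steps[:-KEEP_RECENT], steps[-KEEP_RECENT:]
    return summarize(summary, dropped), recent

summary, steps = "Story so far:", [f"step {i}" for i in range(10)]
summary, steps = prune(summary, steps)

print(summary)     # "Story so far: (+7 earlier steps condensed)"
print(len(steps))  # 3
```

The context sent to the model is then `summary` plus `steps`, which stays roughly constant in size no matter how long the story gets.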

ReAct vs. ReWOO

  • ReAct: Think after every step (improvisational).
  • ReWOO: Think deeply once, generate a structured plan, then execute (strategic).

It’s the difference between:

  • Improvising a story line by line, versus
  • Agreeing on the plot first and then writing chapters.

Both approaches are useful: one is reactive, the other is strategic.
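The cost difference shows up clearly if you count reasoning calls. In this toy contrast, `think` and `tool` are hypothetical stand-ins; ReAct reasons between every step, while ReWOO reasons once up front.

```python
def make_counter():
    """Return a counting `think` stand-in and its call counter."""
    calls = {"n": 0}
    def think(context):
        calls["n"] += 1
        return f"plan based on {context}"
    return think, calls

def tool(step):
    return f"done: {step}"

steps = ["read", "search", "summarize", "answer"]

# ReAct: think, act, observe, think again -- one reasoning call per step.
think, react_calls = make_counter()
for s in steps:
    think(s)  # reason about the latest observation
    tool(s)   # act

# ReWOO: one deep planning call, then straight execution.
think, rewoo_calls = make_counter()
plan = think(steps)                    # plan the whole story first
results = [tool(s) for s in steps]     # execute without re-planning

print(react_calls["n"], rewoo_calls["n"])  # 4 reasoning calls vs 1
```

ReWOO trades flexibility for fewer (and cheaper) model calls; ReAct can change course after any observation, at the price of reasoning on every turn.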

Opinion

I believe ReWOO is what the IBM video referred to when it said "2026 is the age for multiple agents." By the end of this year, this blog might already be outdated, but I hope it offered a clear picture of how the ReAct loop works. Feel free to share your thoughts.


Thanks to the awesome Philly writers’ group for welcoming me!
