A solver for Semantle
Source: Hacker News
Semantle – A Semantic Wordle Variant
Semantle is a Wordle‑style game that scores guesses based on semantic similarity rather than lexical similarity.
Below is a screenshot from a recent game, ordered by similarity to the correct answer:

Screenshot of a Semantle game
My Play‑through
- 1st guess: philosophy – similarity 6.02 (very far).
- 8th guess: biology – similarity 27.55, nudging me toward science‑related terms.
- After a few more attempts I realized the answer was related to a hospital setting.
- 52nd guess: medical – the correct word.
That’s a pretty good round for me; I’ve had games that lasted more than twice as many guesses before I gave up.
If you’ve tried Semantle, you’ll agree it’s hard, but it’s solvable by:
- Gradually homing in on words that yield higher similarity scores.
- Steering away from words that give lower scores.
A Faster, Algorithmic Approach
Ethan Jantz and I wondered whether we could do better algorithmically. This post describes a simple solver we built while at the Recurse Center. It reliably finds the answer in around three guesses.
What information does the game give you?
Semantle represents words using word embeddings: numerical vectors that capture word meanings. Specifically, it relies on the Google News word2vec model, which encodes each word as a 300‑dimensional vector.
For each guess, the game computes the cosine similarity between:
- the embedding of your guess (g), and
- the embedding of the target word (t).

This cosine similarity value is the feedback you receive after each guess.
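Concretely, the feedback can be reproduced in a few lines of NumPy. The vectors below are toy three‑dimensional stand‑ins, not real word2vec embeddings:

```python
import numpy as np

def cosine_similarity(g: np.ndarray, t: np.ndarray) -> float:
    """Cosine similarity between a guess embedding g and a target embedding t."""
    return float(np.dot(g, t) / (np.linalg.norm(g) * np.linalg.norm(t)))

# Toy 3-dimensional "embeddings" (real word2vec vectors have 300 dimensions)
guess = np.array([1.0, 0.0, 0.0])
target = np.array([1.0, 1.0, 0.0])
print(round(cosine_similarity(guess, target), 4))  # 0.7071
```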
Why is it hard?
- A single cosine similarity tells you only how close your guess is to the target (i.e., “hot” vs. “cold”).
- It gives no directional information about where the target lies in semantic space.
Consequently, you must combine feedback from multiple guesses and mentally triangulate the answer within the high‑dimensional embedding space.
Can we solve for the embedding of the target word?
If you want to skip ahead to how we implemented the solver, you can jump to the next section.
But first, let’s digress briefly to discuss why solving for the target word directly isn’t practical.
Why a direct solution is infeasible
A natural first idea is to treat each guess as a clue to the hidden vector and try to combine those clues to recover it directly. In embedding space, that translates to solving for the target vector using a system of linear equations.
The similarity score the game returns with each guess is
[ \text{similarity} = \cos(\theta) = \frac{g \cdot t}{|g|\,|t|} ]
If we assume embeddings are normalized, then (|g| = |t| = 1) and similarity reduces to the dot product:
[ \text{similarity} = g \cdot t ]
For each guess we obtain one linear equation involving all 300 unknown components of the target embedding (\mathbf{t}).
How many equations do we need?
In general, we need at least as many independent equations as unknowns to pin down a unique solution to a linear system.
Therefore, we would need at least 300 independent guesses before we could recover (\mathbf{t}) and then look up the nearest word in the vocabulary.
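A quick NumPy sanity check (with a synthetic, randomly generated target rather than a real word) confirms that 300 independent guesses pin down a 300‑dimensional embedding exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 300

# Hidden target embedding, normalized to unit length
t = rng.normal(size=dim)
t /= np.linalg.norm(t)

# 300 independent "guess" embeddings, also normalized
G = rng.normal(size=(dim, dim))
G /= np.linalg.norm(G, axis=1, keepdims=True)

# Each similarity score is one linear equation: G @ t = scores
scores = G @ t

# With 300 independent equations we can solve for all 300 unknowns
t_recovered = np.linalg.solve(G, scores)
print(np.allclose(t_recovered, t))  # True
```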
Practical implications
Semantle is a hard game, but 300 guesses isn’t good enough to beat a human player (or at least, I’d like to think so).
Consequently, instead of trying to solve for the target embedding directly, we:
- Exploit the geometry of cosine similarity – the angle between vectors carries useful information.
- Apply a filtering approach – iteratively narrow down the candidate set based on each guess’s similarity score.
This method turned out to be surprisingly effective and far more practical than solving a 300‑dimensional linear system.
How We Built the Solver
Geometrically, when we guess a word and receive a similarity score, the target word must lie somewhere on a surface of constant cosine similarity to the guess. On the unit sphere of embeddings, this surface corresponds to a ring of points that make the same angle with the guess.

The true target must lie somewhere on this ring.
Each guess therefore acts as a very strong filter: only words whose similarity to our guess matches the returned score can still be the target. We can exploit this fact to build a solver.
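One filtering step can be sketched on synthetic data; the random unit vectors below stand in for real word2vec embeddings, and the vocabulary size and indices are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_words = 300, 10_000

# Toy vocabulary of unit-norm "embeddings"
vocab = rng.normal(size=(n_words, dim))
vocab /= np.linalg.norm(vocab, axis=1, keepdims=True)

target = vocab[42]      # the hidden word
guess = vocab[7]        # an arbitrary guess
score = guess @ target  # the similarity the game would report

# Keep only candidates whose similarity to the guess matches the score
sims = vocab @ guess
survivors = np.flatnonzero(np.abs(sims - score) < 1e-5)
# The true target always survives; only a few random collisions join it
print(len(survivors))
```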
The Elimination Approach
The solver works as follows:
1. Initialize a list of all possible target words.
   - Initially this list contains every word in the embedding vocabulary for GoogleNews‑vectors‑negative300.
   - (In theory we could restrict it to the 5,000 most popular English words that Semantle actually uses, but the solver works fine even with several million candidates.)
2. Run a clone of Semantle that uses the same embeddings so we can obtain similarity scores instantly.
3. Iterate until only one candidate remains:

| Step | Action |
| --- | --- |
| 1️⃣ | Pick a random candidate word as a guess. |
| 2️⃣ | Ask Semantle for the similarity score between the guess and the hidden target. |
| 3️⃣ | Compute the similarity between the guess and every remaining candidate word. |
| 4️⃣ | Keep only those candidates whose similarity (within a small tolerance) matches the reported score. |
| 🔁 | Repeat steps 1‑4. |
Note 1: The implementation actually uses cosine distance (which is 1 – cosine similarity), but the logic is identical.
Note 2: Random guessing is simple; more sophisticated strategies could choose guesses that maximise information gain.
Visualisation
After each guess the set of possible targets is constrained to a ring on the unit sphere. The intersection of successive rings quickly shrinks the candidate set.

Each guess narrows the viable candidates. After the first (blue) guess many words remain; after the second (purple) guess the intersection leaves only a few.
Core Filtering Code
```python
import random

tolerance = 1e-5  # allowable deviation from the reported score
potential_words = list(all_words)  # initialise with the full vocabulary

while len(potential_words) > 1:
    # 1️⃣ Choose a guess
    guess = random.choice(potential_words)

    # 2️⃣ Get the similarity score from the game
    target_score = get_similarity_from_game(guess)

    # 3️⃣ Compute similarities to every remaining candidate
    # word_vectors.distances returns cosine distance, so we convert
    distances = word_vectors.distances(guess, other_words=potential_words)
    similarities = 1.0 - distances  # cosine similarity = 1 - cosine distance

    # 4️⃣ Keep only words whose similarity matches the reported score
    potential_words = [
        w for w, s in zip(potential_words, similarities)
        if abs(s - target_score) < tolerance
    ]

# The remaining word is the answer
answer = potential_words[0]
```
The loop repeatedly narrows the candidate list until a single word remains, which is then returned as the solution.
Why Does This Work So Quickly?
Although the word‑embedding space is 300‑dimensional, the vocabulary itself is sparse within that space. Consequently, each cosine‑similarity constraint is highly restrictive. After just one or two guesses, the set of words that lie at the right distance shrinks dramatically, making the filtering strategy extremely effective.
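A back‑of‑envelope estimate makes this concrete (the 0.5 score spread below is an illustrative assumption, not a measured value):

```python
# Rough estimate: 3M vocabulary words whose similarity scores to a guess
# spread over a range of ~0.5; a ±0.0001 band keeps only a sliver of them
vocab_size = 3_000_000
score_range = 0.5   # assumed spread of similarity scores (illustrative)
band = 2 * 0.0001   # tolerance window around the reported score
expected_survivors = vocab_size * band / score_range
print(expected_survivors)  # ≈ 1200
```

That is the same order of magnitude as the real run below, where the first guess cuts roughly three million candidates down to a few thousand.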
Example Run
We filter within a tolerance of 0.0001, which corresponds to the four‑decimal‑place precision of real Semantle’s cosine‑similarity scores.
── Guess 1: countryside ──
cosine similarity: 0.023168 (±0.0001)
candidates searched: 3,000,000 → remaining: 3,296
── Guess 2: levelization ──
cosine similarity: 0.097055 (±0.0001)
candidates searched: 3,296 → remaining: 3
── Guess 3: Skrzynski ──
cosine similarity: 0.005881 (±0.0001)
candidates searched: 3 → remaining: 1
Answer: **medical**
What This Shows
- The guesses don’t need to “trend” toward the correct answer the way a human player’s guesses would. “Countryside,” “levelization,” and “Skrzynski” have nothing to do with medical. The solver isn’t walking a recognizable path.
- Both humans and the solver are homing in on the answer, but they do so in very different ways:
  - Human intuition – akin to gradient descent: each guess nudges us locally toward the target, guided by meaning.
  - Algorithmic pruning – a global approach: each guess makes huge, exact jumps, slicing away vast swaths of impossible words until only the answer remains.
The same embedding space can therefore be navigated by meaning or carved up by geometry, yet both routes converge on the same solution. Pretty cool, right?