I trained a neural network to break my own encrypted search. It learned nothing.

Published: 2 days ago (June 11, 2026 at 07:27 PM EDT)

4 min read

Source: Dev.to

A few months ago I built a way to search documents by meaning while keeping the embeddings hidden — even from the server doing the search. I called it ZATRON. The obvious question everyone (including me) kept asking was: does it actually hide anything, or does it just look scrambled? Scrambled-looking isn’t the same as secure. So instead of trusting a correlation number, I did the thing that actually scares me: I trained a neural network to break it. This post is the honest write-up — including the part where I tried hard to make the attack win. Standard semantic search stores embeddings as plain vectors. Anyone with database access can cluster them by topic and infer content without reading a word. ZATRON transforms each embedding into a modular barcode: project onto PCA channels, quantize, add a per-document keyed mask, and keep only residues modulo a set of primes. You compare barcodes in modular space; the original embedding is never reconstructed. Retrieval still works — 98% of cosine quality on 626K MSMARCO passages. The question is whether the barcodes leak. My first security check was a Spearman correlation between barcode distance and true similarity. It came out near zero (ρ ≈ 0.05). Good — but a low linear correlation only rules out a simple attacker. A neural network doesn’t need linearity. It can learn whatever structure is there. So the real test: give a neural network every advantage and see if it can recover similarity from the barcodes. I used a known-plaintext attacker — the strongest realistic setting: It sees all the stored barcodes. It also gets 80,000 document pairs with their true cosine similarities (as if a chunk of plaintext leaked). It trains a model — a linear probe and a 3-layer MLP — to predict the similarity of unseen pairs from per-prime circular-difference features. Train and test pairs share no anchor documents, so it can’t just memorize. And the part that makes the result trustworthy: I ran the identical attack on the unprotected quantized signals as a control. If the attack can’t break those, the attack is too weak and the test means nothing. On 50,000 MSMARCO passages, 100,000 labeled pairs:

Input the attacker sees Linear probe MLP (3-layer)

Unprotected signals (control) ρ = 0.79, AUC = 0.985 ρ = 0.90, AUC = 0.999

ZATRON barcodes ρ = 0.00, AUC = 0.498 ρ = 0.00, AUC = 0.505

The same network that recovers similarity from unprotected signals almost perfectly (AUC 0.999) gets exactly chance level on the barcodes — with 80,000 labeled pairs to learn from. AUC 0.50 is a coin flip. It learned nothing. “8x faster than FHE” is a weak flex — everyone knows FHE is slow. The fairer comparison is ASPE (Wong et al., SIGMOD 2009), the classic encrypted-kNN scheme. ASPE preserves scalar products exactly, so retrieval is perfect — but that same property means any observer can read similarities straight off the ciphertexts.

ASPE (SIGMOD ‘09) ZATRON

Retrieval recall@10 (strict) 100% 81%

Observer reads similarity directly ρ = +0.87 ρ = −0.06

Learned attack (MLP) ρ = +0.91, AUC = 0.99 ρ = +0.01, AUC = 0.52

ASPE buys perfect recall with total leakage. ZATRON gives up a margin on the strictest retrieval metric and leaks nothing — to a direct observer or a trained network. Honesty is the whole point, so the limits: This is the observer threat model. A key holder computing many pairwise distances can still partially recover geometry via MDS (ρ ≈ 0.35) — that’s inherent to any distance-preserving scheme, FHE included. It is a randomized privacy-preserving encoding, not a reversible cipher, and not yet independently audited by a cryptographer. That’s the right bar before anyone calls it production-grade. The strict recall metric here (full top-10 set overlap) is harder than the top-1-in-top-10 number I quote elsewhere. Same system, stricter ruler. Everything is reproducible: pip install zatron

The attack and the ASPE comparison are in the repo as runnable scripts (benchmarks/). If you can make the neural attack win — train it longer, give it more pairs, better features — I genuinely want to see it. Finding the weakness is the point. Code + benchmarks: https://github.com/zahraarmantech/ZATRON

Live demo: https://huggingface.co/spaces/zahraarman/ZATRON

I’d rather have someone break this now than after I’ve claimed too much.

I trained a neural network to break my own encrypted search. It learned nothing.

Related posts

Launching BonVoyage: From Travel Problem to Public Launch

The spec is in the wrong place

Incident Automation: What to Automate, What to Leave to Humans

The Heuristics Say Don't