Building InterOrdra: A Semantic Gap Detector

Published: December 31, 2025 at 11:56 AM EST
4 min read
Source: Dev.to

Week 1 – From abstract idea to deployed MVP

Hi! I’m Rosibis, an AI/ML student transitioning from Technical Support to AI Engineering. This is Week 1 of building InterOrdra, a semantic‑gap detection framework. Follow along as I document the journey.

The Problem

Have you ever explained something that was perfectly clear to you, only to watch the other person’s eyes glaze over? Or read documentation that technically answers your question but somehow… doesn’t?

Those are semantic gaps – and they’re everywhere:

  • 📚 Technical docs that assume knowledge users don’t have
  • 🤖 AI prompts that get confusing responses
  • 🔬 Expert explanations that lose non‑experts entirely
  • 💼 Cross‑team communication where everyone speaks “different languages”

The frustrating part? These gaps are invisible. You know something’s wrong, but you can’t point to exactly where the misunderstanding lives.

I wanted to build a tool that makes these invisible gaps visible and measurable.

The Insight

A few weeks ago, I had this recurring thought (honestly, more like an obsession):

“What if communication gaps aren’t random failures, but detectable patterns in semantic topology?”

I started seeing it geometrically – like two texts existing as point clouds in high‑dimensional space. When they understand each other, the clouds overlap. When they don’t, there are orphaned concepts floating in one space with no corresponding points in the other.
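That picture can be made concrete in a few lines. The sketch below is a toy illustration, not InterOrdra’s actual code: real sentence embeddings are replaced with hand-made vectors, and any sentence whose best cosine similarity against the other text falls below a threshold (0.5 here, an arbitrary choice) is flagged as “orphaned.”

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def orphaned_concepts(embeds_a, embeds_b, threshold=0.5):
    """Indices of vectors in embeds_a with no close counterpart in embeds_b."""
    return [i for i, va in enumerate(embeds_a)
            if max(cosine(va, vb) for vb in embeds_b) < threshold]

# Toy "embeddings": text A has a concept (index 2) that text B never mentions
text_a = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 0.0, 1.0]]
text_b = [[1.0, 0.0, 0.0], [0.8, 0.3, 0.0]]

print(orphaned_concepts(text_a, text_b))  # → [2]
```

With real embeddings the same logic applies point-cloud-wide: the orphans are exactly the points floating in one space with no counterpart in the other.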

This led to a bigger vision I’m calling the Resonance Spectrometer – an instrument to detect coordinated pattern transmission across different “communication bands” (not just human language, but any system that transmits organized information).

InterOrdra is the first instrument in that spectrum: detecting semantic gaps in human text.

But I needed to start somewhere concrete. So: MVP first, philosophy second.

Technical Decisions

Stack

  • Python 3.11 – Fast, clean, great ML ecosystem
  • Sentence‑Transformers (all-MiniLM-L6-v2) – Lightweight semantic embeddings
  • Scikit‑learn – Clustering (DBSCAN) and similarity calculations
  • Streamlit – Rapid prototyping for UI (deployed on Streamlit Cloud)
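In the real stack, scikit-learn’s DBSCAN does the clustering. To make the “DBSCAN over cosine distance” step concrete, here is a stripped-down pure-Python version of the same algorithm – a sketch for intuition, not the deployed code. Points that never land in a dense neighborhood end up labeled -1 (noise), which is where orphaned concepts show up:

```python
from math import sqrt

def cosine_dist(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

def dbscan(points, eps=0.3, min_samples=2):
    """Tiny DBSCAN over cosine distance: one cluster label per point, -1 = noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if cosine_dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_samples:
            labels[i] = -1  # noise, unless later reached from a core point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: claimed, but not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(len(points))
                           if cosine_dist(points[j], points[k]) <= eps]
            if len(j_neighbors) >= min_samples:
                queue.extend(k for k in j_neighbors if labels[k] is None)
    return labels

# Two nearly-parallel vectors cluster together; the orthogonal one is noise
pts = [[1.0, 0.0], [0.95, 0.1], [0.0, 1.0]]
print(dbscan(pts, eps=0.1, min_samples=2))  # → [0, 0, -1]
```

The sklearn equivalent is simply `DBSCAN(eps=..., min_samples=..., metric="cosine").fit(embeddings)`; the toy version just makes the mechanics visible.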

1. .gitignore & venv

I committed my virtual environment before setting up a .gitignore, so I had to untrack it and force-push the cleanup:

# .gitignore
venv/
__pycache__/
*.pyc

# untrack the venv, then commit and push the cleanup
git rm -r --cached venv/
git add .gitignore
git commit -m "Remove venv from tracking"
git push --force

Lesson: .gitignore is your friend. Set it up first, not after you’ve already pushed disasters.

2. Import‑Path Confusion

Streamlit Cloud makes different working‑directory assumptions than my local dev environment, so my imports broke on deployment:

# Broke on Streamlit Cloud
from backend.embeddings import generate_embeddings

Fixed version:

# Put the app's own directory on sys.path so `backend` resolves on Streamlit Cloud
import os, sys
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
from backend.embeddings import generate_embeddings

Lesson: Always test relative imports. Better yet, structure projects as proper Python packages from day 1.
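For the “proper package” route, one minimal layout looks like the sketch below (names and metadata are hypothetical, assuming a setuptools build). With `pip install -e .` in the environment – on Streamlit Cloud, typically via a `-e .` line in requirements.txt – `from backend.embeddings import ...` resolves the same way everywhere, with no sys.path patching:

```text
project-root/
├── app.py               # Streamlit entry point
├── pyproject.toml
└── backend/
    ├── __init__.py      # makes backend an importable package
    └── embeddings.py

# pyproject.toml
[project]
name = "interordra"
version = "0.1.0"

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
```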

Current State

✅ What Works

  • Semantic similarity analysis between any two texts
  • Detection of “orphaned concepts” (ideas in one text with no match in the other)
  • Vocabulary analysis (shared vs. unique words)
  • 3‑D interactive visualization of semantic topology
  • Actionable recommendations to close gaps

Deployed and public.
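The vocabulary analysis is the simplest feature to show. A toy version (again, not the deployed code) is just set operations over lowercased, punctuation-stripped tokens:

```python
import re

def vocabulary_analysis(text_a, text_b):
    """Shared vs. unique words between two texts (lowercased, letters only)."""
    words_a = set(re.findall(r"[a-záéíóúñü']+", text_a.lower()))
    words_b = set(re.findall(r"[a-záéíóúñü']+", text_b.lower()))
    return {
        "shared": words_a & words_b,
        "only_a": words_a - words_b,
        "only_b": words_b - words_a,
    }

result = vocabulary_analysis("The model embeds each sentence.",
                             "Each sentence becomes a vector.")
print(result["shared"])  # → {'each', 'sentence'}
```

The “unique” sets are a cheap first signal of a gap: words one side leans on that the other never uses at all.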

⚠️ Current Limitations

  • UI only in Spanish (English translation in progress)
  • Mobile experience has occasional rendering issues
  • Only detects similarity‑based gaps – still exploring complementarity and harmonic patterns

📊 Early Traction

  • Live for ~1 week
  • Growing organically
  • Waiting for first user feedback

What’s Next

Immediate (this week)

  • 🌐 English UI toggle
  • 📱 Mobile‑responsive fixes
  • 📄 Export results as PDF

Short‑term (next 2–4 weeks)

  • Advanced gap detection – beyond similarity analysis
  • Analytics setup (seeing actual usage patterns)
  • File‑upload support (.txt, .docx, .pdf)

Medium‑term (1–3 months)

  • Public API (FastAPI backend)
  • Multi‑text comparison (analyze 3+ texts simultaneously)
  • Deeper semantic‑topology analysis

Try It Yourself

Curious what you’ll discover? Drop your findings in the comments or open an issue on GitHub if you run into anything interesting. Happy gap hunting!

Reflection

This project felt different. Usually I second‑guess myself constantly. With InterOrdra, I had this weird certainty – like I was building something that needed to exist, and I was just the person who happened to notice it first.

It took 4 days from “hmm, interesting idea” to “deployed MVP with users.” That’s the power of:

  • Starting with a concrete problem (not abstract philosophy)
  • Choosing boring, reliable tech
  • Shipping fast, iterating faster
  • Not letting perfect kill good

Next post: diving deeper into the semantic‑topology math and why DBSCAN + cosine similarity reveals structure that traditional NLP misses.

What do you think? Have you experienced semantic gaps in your work? How do you currently handle miscommunication between systems?

Drop a comment below – I’d love to hear your thoughts! 💬

Series: Building InterOrdra
Part: 1

Building in public. Learning in public. Breaking things in public.
Follow along: I’m documenting the full journey from Technical Support Engineer → AI/ML Engineer.
