Building InterOrdra: A Semantic Gap Detector

Published: December 31, 2025 at 11:56 AM EST
4 min read
Source: Dev.to

Week 1 – From abstract idea to deployed MVP

Hi! I’m Rosibis, an AI/ML student transitioning from Technical Support to AI Engineering. This is Week 1 of building InterOrdra, a semantic‑gap detection framework. Follow along as I document the journey.

The Problem

Have you ever explained something that was perfectly clear to you, only to watch the other person’s eyes glaze over? Or read documentation that technically answers your question but somehow… doesn’t?

Those are semantic gaps – and they’re everywhere:

  • 📚 Technical docs that assume knowledge users don’t have
  • 🤖 AI prompts that get confusing responses
  • 🔬 Expert explanations that lose non‑experts entirely
  • 💼 Cross‑team communication where everyone speaks “different languages”

The frustrating part? These gaps are invisible. You know something’s wrong, but you can’t point to exactly where the misunderstanding lives.

I wanted to build a tool that makes these invisible gaps visible and measurable.

The Insight

A few weeks ago, I had this recurring thought (honestly, more like an obsession):

“What if communication gaps aren’t random failures, but detectable patterns in semantic topology?”

I started seeing it geometrically – like two texts existing as point clouds in high‑dimensional space. When they understand each other, the clouds overlap. When they don’t, there are orphaned concepts floating in one space with no corresponding points in the other.
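That picture can be made concrete in a few lines. The sketch below is a toy illustration, not InterOrdra’s actual code: real sentence embeddings are replaced with hand-made vectors, and any sentence whose best cosine similarity against the other text falls below a threshold (0.5 here, an arbitrary choice) is flagged as “orphaned.”

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def orphaned_concepts(embeds_a, embeds_b, threshold=0.5):
    """Indices of vectors in embeds_a with no close counterpart in embeds_b."""
    return [i for i, va in enumerate(embeds_a)
            if max(cosine(va, vb) for vb in embeds_b) < threshold]

# Toy "embeddings": text A has a concept (index 2) that text B never mentions
text_a = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 0.0, 1.0]]
text_b = [[1.0, 0.0, 0.0], [0.8, 0.3, 0.0]]

print(orphaned_concepts(text_a, text_b))  # → [2]
```

With real embeddings the same logic applies point-cloud-wide: the orphans are exactly the points floating in one space with no counterpart in the other.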

This led to a bigger vision I’m calling the Resonance Spectrometer – an instrument to detect coordinated pattern transmission across different “communication bands” (not just human language, but any system that transmits organized information).

InterOrdra is the first instrument in that spectrum: detecting semantic gaps in human text.

But I needed to start somewhere concrete. So: MVP first, philosophy second.

Technical Decisions

Stack

  • Python 3.11 – Fast, clean, great ML ecosystem
  • Sentence‑Transformers (all-MiniLM-L6-v2) – Lightweight semantic embeddings
  • Scikit‑learn – Clustering (DBSCAN) and similarity calculations
  • Streamlit – Rapid prototyping for UI (deployed on Streamlit Cloud)
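In the real stack, scikit-learn’s DBSCAN does the clustering. To make the “DBSCAN over cosine distance” step concrete, here is a stripped-down pure-Python version of the same algorithm – a sketch for intuition, not the deployed code. Points that never land in a dense neighborhood end up labeled -1 (noise), which is where orphaned concepts show up:

```python
from math import sqrt

def cosine_dist(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

def dbscan(points, eps=0.3, min_samples=2):
    """Tiny DBSCAN over cosine distance: one cluster label per point, -1 = noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if cosine_dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_samples:
            labels[i] = -1  # noise, unless later reached from a core point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: claimed, but not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(len(points))
                           if cosine_dist(points[j], points[k]) <= eps]
            if len(j_neighbors) >= min_samples:
                queue.extend(k for k in j_neighbors if labels[k] is None)
    return labels

# Two nearly-parallel vectors cluster together; the orthogonal one is noise
pts = [[1.0, 0.0], [0.95, 0.1], [0.0, 1.0]]
print(dbscan(pts, eps=0.1, min_samples=2))  # → [0, 0, -1]
```

The sklearn equivalent is simply `DBSCAN(eps=..., min_samples=..., metric="cosine").fit(embeddings)`; the toy version just makes the mechanics visible.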

1. .gitignore & venv

I committed my virtual environment before setting up a .gitignore, so I had to untrack it and force-push the cleanup:

# .gitignore
venv/
__pycache__/
*.pyc

# untrack the venv, then commit and push the cleanup
git rm -r --cached venv/
git add .gitignore
git commit -m "Remove venv from tracking"
git push --force

Lesson: .gitignore is your friend. Set it up first, not after you’ve already pushed disasters.

2. Import‑Path Confusion

Streamlit Cloud makes different working‑directory assumptions than my local dev environment, so my imports broke on deployment:

# Broke on Streamlit Cloud
from backend.embeddings import generate_embeddings

Fixed version:

# Put the app's own directory on sys.path so `backend` resolves on Streamlit Cloud
import os, sys
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
from backend.embeddings import generate_embeddings

Lesson: Always test relative imports. Better yet, structure projects as proper Python packages from day 1.
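For the “proper package” route, one minimal layout looks like the sketch below (names and metadata are hypothetical, assuming a setuptools build). With `pip install -e .` in the environment – on Streamlit Cloud, typically via a `-e .` line in requirements.txt – `from backend.embeddings import ...` resolves the same way everywhere, with no sys.path patching:

```text
project-root/
├── app.py               # Streamlit entry point
├── pyproject.toml
└── backend/
    ├── __init__.py      # makes backend an importable package
    └── embeddings.py

# pyproject.toml
[project]
name = "interordra"
version = "0.1.0"

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
```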

Current State

✅ What Works

  • Semantic similarity analysis between any two texts
  • Detection of “orphaned concepts” (ideas in one text with no match in the other)
  • Vocabulary analysis (shared vs. unique words)
  • 3‑D interactive visualization of semantic topology
  • Actionable recommendations to close gaps

Deployed and public.
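The vocabulary analysis is the simplest feature to show. A toy version (again, not the deployed code) is just set operations over lowercased, punctuation-stripped tokens:

```python
import re

def vocabulary_analysis(text_a, text_b):
    """Shared vs. unique words between two texts (lowercased, letters only)."""
    words_a = set(re.findall(r"[a-záéíóúñü']+", text_a.lower()))
    words_b = set(re.findall(r"[a-záéíóúñü']+", text_b.lower()))
    return {
        "shared": words_a & words_b,
        "only_a": words_a - words_b,
        "only_b": words_b - words_a,
    }

result = vocabulary_analysis("The model embeds each sentence.",
                             "Each sentence becomes a vector.")
print(result["shared"])  # → {'each', 'sentence'}
```

The “unique” sets are a cheap first signal of a gap: words one side leans on that the other never uses at all.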

⚠️ Current Limitations

  • UI only in Spanish (English translation in progress)
  • Mobile experience has occasional rendering issues
  • Only detects similarity‑based gaps – still exploring complementarity and harmonic patterns

📊 Early Traction

  • Live for ~1 week
  • Growing organically
  • Waiting for first user feedback

What’s Next

Immediate (this week)

  • 🌐 English UI toggle
  • 📱 Mobile‑responsive fixes
  • 📄 Export results as PDF

Short‑term (next 2–4 weeks)

  • Advanced gap detection – beyond similarity analysis
  • Analytics setup (seeing actual usage patterns)
  • File‑upload support (.txt, .docx, .pdf)

Medium‑term (1–3 months)

  • Public API (FastAPI backend)
  • Multi‑text comparison (analyze 3+ texts simultaneously)
  • Deeper semantic‑topology analysis

Try It Yourself

Curious what you’ll discover? Drop your findings in the comments or open an issue on GitHub if you run into anything interesting. Happy gap hunting!

Reflection

This project felt different. Usually I second‑guess myself constantly. With InterOrdra, I had this weird certainty – like I was building something that needed to exist, and I was just the person who happened to notice it first.

It took 4 days from “hmm, interesting idea” to “deployed MVP with users.” That’s the power of:

  • Starting with a concrete problem (not abstract philosophy)
  • Choosing boring, reliable tech
  • Shipping fast, iterating faster
  • Not letting perfect kill good

Next post: diving deeper into the semantic‑topology math and why DBSCAN + cosine similarity reveals structure that traditional NLP misses.

What do you think? Have you experienced semantic gaps in your work? How do you currently handle miscommunication between systems?

Drop a comment below – I’d love to hear your thoughts! 💬

Series: Building InterOrdra
Part: 1

Building in public. Learning in public. Breaking things in public.
Follow along: I’m documenting the full journey from Technical Support Engineer → AI/ML Engineer.
