AI Matching: Matrix First, Neural Nets Later
Source: Dev.to
1. The Real Business Problem: “Where Do We Get Data to Train a Neural Network?”
Let’s start with the problem most teams avoid articulating clearly.
Neural networks do not fail because they are bad.
They fail because they need data that doesn’t exist yet.
To train a meaningful matching model you need:
- Historical matches
- Outcomes (success/failure)
- User behavior (clicks, acceptances, conversions)
- Enough volume to avoid over‑fitting
Early‑stage systems have none of that. This creates a paradox:
- You need good matching to get users.
- You need users to get data.
- You need data to train matching.
Most teams quietly ignore this and ship:
- Random relevance
- Over‑confident AI labels
- Brittle rule engines disguised as “ML”
That’s not a technical issue; it’s a product and architecture problem.
2. A Concrete Use Case: Choosing the Right Marketing Channel or Agency
Scenario: A company is launching a new marketing campaign and wants to choose the right advertising channel, agency, or influencer.
Constraints are realistic:
- Limited budget
- Brand reputation at stake
- Unclear expectations about what will work
There is no historical performance data for this exact setup.
Supply‑side attributes (channels, agencies, influencers):
- Different levels of reach
- Different credibility
- Different risk profiles
- Different communication styles
The business question is not:
“Which option is statistically similar to this campaign?”
The real question is:
“Which option best fits the expectations and constraints of this campaign?”
That’s a compatibility problem, not a similarity problem.
3. Why “Just Train a Neural Network” Doesn’t Work Here
At this point, someone usually says:
“Let’s just embed everything and train a model later.”
That works only if you already have:
- Outcomes
- Labels
- Scale
In our use case, you don’t. Trying to use neural networks leads to one of three failures:
- Over‑fitting on tiny data – the model outputs noise that looks confident.
- Model disabled “temporarily” → permanently – the team loses trust.
- No prior understanding of what “fit” means – the system is blind.
The real issue isn’t a lack of ML talent; it’s that the system has no prior notion of “fit”. You have to supply that prior yourself.
4. Reframing the Problem: Similarity vs. Compatibility
This is the key conceptual shift.
Most ML tooling is built around similarity:
- Cosine similarity
- Euclidean distance
- Nearest‑neighbors
Similarity answers:
“How alike are these two things?”
But business matching rarely asks that. Instead it asks:
“How appropriate is this option for this context?”
That’s compatibility—which is:
- Asymmetric
- Expectation‑driven
- Domain‑specific
And it can be expressed explicitly, without pretending to learn it from non‑existent data.
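To make the asymmetry concrete, here is a minimal sketch (the `compat` table is illustrative, not taken from the article's matrix): cosine similarity returns the same value in both directions, while a compatibility lookup is free to differ by direction.

```python
import math

def cosine(a, b):
    """Similarity: symmetric by construction, cosine(a, b) == cosine(b, a)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical directed compatibility: (asking side, answering side) -> score.
compat = {
    ('corporate', 'nano'): 0.2,  # a corporate campaign rarely fits a nano voice
    ('nano', 'corporate'): 0.7,  # a nano influencer may gladly take corporate work
}

a, b = [1.0, 0.0, 1.0], [0.5, 1.0, 0.0]
assert cosine(a, b) == cosine(b, a)  # similarity: direction doesn't matter
assert compat[('corporate', 'nano')] != compat[('nano', 'corporate')]  # compatibility: it does
```

The distance metric cannot express "A suits B but B does not suit A"; the table can.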
5. Solution: Compatibility Matrix (Feature Matrix, Not ML)
Instead of trying to learn relevance, encode domain knowledge as a matrix.
Define two small, stable feature spaces.
Campaign side
blog_type ∈ { corporate, brand_voice, expert, personal }
Captures:
- Formality of communication
- Expected authority level
- Acceptable personal storytelling
Supply side (agency / influencer / channel)
social_status ∈ { celebrity, macro, micro, nano }
Captures:
- Perceived authority
- Reach expectations
- Risk tolerance
- Credibility
Now define a compatibility matrix:
compatibility[blog_type][social_status] → score ∈ [0 … 1]
The matrix answers:
“Given this campaign style, how appropriate is this level of authority?”
It is not a guess; it is a product hypothesis.
6. Example: A Simple 4 × 4 Compatibility Matrix
|             | celebrity | macro | micro | nano |
|-------------|-----------|-------|-------|------|
| corporate   | 1.0       | 0.8   | 0.4   | 0.2  |
| brand_voice | 0.7       | 1.0   | 0.8   | 0.5  |
| expert      | 0.6       | 0.9   | 1.0   | 0.7  |
| personal    | 0.3       | 0.6   | 0.9   | 1.0  |
```python
# Compatibility Matrix lookup (Day‑1 matching)
matrix = {
    'corporate':   [1.0, 0.8, 0.4, 0.2],
    'brand_voice': [0.7, 1.0, 0.8, 0.5],
    'expert':      [0.6, 0.9, 1.0, 0.7],
    'personal':    [0.3, 0.6, 0.9, 1.0],
}

def matrix_score(campaign_type: str, influencer_status: str) -> float:
    """O(1) lookup — thousands of RPS without trouble."""
    statuses = ['celebrity', 'macro', 'micro', 'nano']
    idx = statuses.index(influencer_status)
    return matrix[campaign_type][idx]

# Production usage
score = matrix_score('corporate', 'macro')  # 0.8 ✅
print(f"Corporate ↔ Macro: {score}")
```
What this represents in business terms
- Corporate campaigns prioritize authority and low risk.
- Personal storytelling thrives with relatable, smaller voices.
- Expert campaigns value credibility over raw reach.
Important clarifications
- The numbers are relative, not absolute.
- They don’t predict success; they define expected fit.
7. Why This Works Without Data
“Isn’t this just hard‑coded logic?”
Yes — and that’s exactly the point.
When you have no data, the only reliable source of truth is human expertise. By turning that expertise into a transparent matrix:
- You get day‑one relevance – the system can make sensible recommendations immediately.
- You avoid over‑fitting – there’s no model to over‑fit on tiny data.
- You create a feedback loop – as real outcomes arrive, you can adjust the matrix or gradually introduce ML components.
In other words, the matrix is a baseline that can evolve into a data‑driven model once you have enough signal.
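One way to close that loop (a sketch under the assumption that outcomes arrive as binary success/failure signals, not the article's implementation) is to nudge each matrix cell toward observed outcomes with a small learning rate, so a handful of results cannot overwhelm the prior:

```python
def update_cell(matrix, campaign_type, status_idx, observed, lr=0.1):
    """Move the prior score a small step toward an observed outcome (0 or 1)."""
    prior = matrix[campaign_type][status_idx]
    matrix[campaign_type][status_idx] = (1 - lr) * prior + lr * observed
    return matrix[campaign_type][status_idx]

matrix = {'corporate': [1.0, 0.8, 0.4, 0.2]}
# Two successes and one failure for corporate ↔ macro (index 1):
for outcome in (1, 1, 0):
    update_cell(matrix, 'corporate', 1, outcome)
# The cell has drifted from its 0.8 prior toward the observed rate.
```

The small `lr` is the whole point: the matrix stays an expert prior that data can bend, not a counter that three lucky conversions can rewrite.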
8. From Matrix to Machine Learning (When Data Arrives)
When you start collecting outcomes, you can:
- Validate the matrix against real conversion rates.
- Calibrate the scores (e.g., via logistic regression) while keeping the same feature space.
- Hybridize – use the matrix as a strong prior in a Bayesian model or as a feature in a downstream learner.
The transition is smooth because the feature definitions stay the same; only the way you compute the final relevance score changes.
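A sketch of that calibration step (toy gradient descent in plain Python, with made-up data, not production code): the matrix score stays the only feature, and a logistic curve learns how it maps to observed conversion probability.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_calibration(scores, labels, lr=0.5, steps=2000):
    """Fit p(convert) = sigmoid(w * matrix_score + b) by gradient descent."""
    w, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        gw = gb = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(w * s + b) - y
            gw += err * s
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# Made-up outcomes: higher matrix scores convert more often.
scores = [0.2, 0.4, 0.6, 0.8, 1.0, 0.3, 0.9]
labels = [0,   0,   1,   1,   1,   0,   1]
w, b = fit_calibration(scores, labels)
calibrated = sigmoid(w * 0.8 + b)  # matrix score 0.8, now on a probability scale
```

With scikit-learn available, `LogisticRegression` on the single matrix-score feature would do the same job; the key property is that the feature definition never changes, only the mapping to probability.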
9. Takeaways
- Don’t wait for data to start delivering relevance.
- Encode domain knowledge explicitly as a compatibility matrix.
- Treat the matrix as a product hypothesis, not a final prediction.
- Use it as a launchpad for a future data‑driven model.
By reframing the problem from “similarity” to “compatibility” and leveraging human expertise up front, you can ship a useful matching engine from day one—no big data required.
10. Why a Compatibility Matrix?
A compatibility matrix is structured, graded, and explicit, unlike:
- Binary rules
- if/else chains
- “Fake” ML models
A compatibility matrix gives you:
- Deterministic behavior
- Explainable decisions
- Controllable bias
- Stable early relevance
Most importantly, it gives the system a worldview before any data exists.
11. How This Evolves Into Machine Learning (Without Rewrites)
This approach is not anti‑ML – it’s pre‑ML.
As the system runs, you naturally collect:
- Which matches were shortlisted
- Which were accepted
- Which led to engagement or conversion
At that point, the transition is incremental.
Phase 1 — Matrix Only
```python
score = compatibility_matrix[blog_type][social_status]
```
Phase 2 — Hybrid
```python
# Phase 2: Matrix 70% + NN 30%
matrix_score = 0.8
nn_score = nn_model.predict(features)  # 0.75
final_score = 0.7 * matrix_score + 0.3 * nn_score  # 0.785
```
Phase 3 — ML‑Dominant
```python
score = nn_prediction
```
The matrix never disappears; it becomes:
- A baseline
- A regularizer
- A fallback for cold‑start
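A minimal sketch of that fallback role (the weights mirror the 70/30 hybrid blend; `nn_score=None` stands in for any pair the model has never seen): the request degrades gracefully to the matrix prior instead of failing on cold start.

```python
def final_score(matrix_prior, nn_score=None, nn_weight=0.3):
    """Blend the learned score with the matrix prior; fall back to the
    prior alone when no model prediction is available (cold start)."""
    if nn_score is None:
        return matrix_prior
    return (1 - nn_weight) * matrix_prior + nn_weight * nn_score

assert final_score(0.8) == 0.8  # brand-new supplier: matrix only
assert abs(final_score(0.8, nn_score=0.75) - 0.785) < 1e-9  # warm pair: hybrid blend
```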
12. Why This Gives You Day‑One Relevance
The biggest hidden risk in matching systems is irrelevance at launch.
If users see poor matches:
- They don’t interact
- You don’t collect data
- Your ML roadmap dies before it starts
A compatibility matrix avoids that trap. You get:
- Reasonable defaults
- Behavior aligned with business expectations
- Trust from users
- Data that actually reflects intent
All without pretending you have Big Data.
```python
# Day 1: 100% matrix, no training data needed
def get_matches(request, suppliers, min_score=0.6):
    matches = []
    for supplier in suppliers:
        score = matrix_score(request.campaign_type, supplier.category)
        if score >= min_score:
            matches.append((supplier, score))
    # Sort by score descending and return the top 14
    return sorted(matches, key=lambda x: x[1], reverse=True)[:14]

# Real metrics: 47 suppliers → 12 matches → 3% conversion
# O(n) complexity, thousands of RPS, zero cold start
```
Final Takeaway
If there’s one idea worth remembering:
Similarity is a mathematical concept.
Compatibility is a business concept.
Neural networks excel at learning similarity after the world gives you data.
Compatibility matrices let you act before that moment arrives.
- Matrix first.
- Neural nets later.
That’s not a compromise; it’s how real matching systems survive long enough to learn.