A (very) Short Look at How MySQL Joins Work
Source: Dev.to
How MySQL Joins Work
MySQL evaluates joins using a Nested‑Loop Join (NLJ) algorithm. In simple terms, it takes one row from the first table, scans the next table for matching rows, and repeats this process for additional tables. In the worst case this yields O(n²) time complexity, but built‑in optimizations keep real‑world performance much lower.
Example Join
SELECT *
FROM t1
JOIN t2 ON t1.t2_id = t2.id
JOIN t3 ON t2.t3_id = t3.id;
Pseudocode for the NLJ Algorithm
for each row in t1
for each t2 matching t1.t2_id
for each t3 matching t2.t3_id
return row if join condition is met
Each inner loop runs for every row of its outer loop, so the cost can grow quickly with large tables, especially when there are no WHERE clauses or indexes.
Condition Pushdown
MySQL applies condition pushdown, meaning that relevant parts of the WHERE clause are evaluated at each loop level rather than after all joins are performed.
SELECT *
FROM t1
JOIN t2 ON t1.t2_id = t2.id
JOIN t3 ON t2.t3_id = t3.id
WHERE t2.status = 'active';
In this query MySQL filters t2.status = 'active' while looping through t2, passing only the qualifying rows to the t3 check. This avoids joining unnecessary rows.
Optimization Tips
- Add selective indexes on columns used in join conditions and
WHEREclauses. Indexes allow MySQL to skip irrelevant rows, dramatically reducing the amount of scanning required. - Plan for growth: As tables become larger, even well‑indexed joins can become costly. Consider:
- Reducing the need for complex joins by denormalizing where appropriate.
- Partitioning large tables.
- Reviewing query patterns and refactoring them for efficiency.
A little foresight in schema design and indexing can save a lot of performance pain later on.