Solving Latency and Pagination in Image and Keyword Based Property Search
Source: Dev.to
Ownership and Approach
As a senior software engineer, I took ownership of fixing both the performance and correctness of Deep Search.
One key decision was to keep the system deterministic and avoid using LLMs for ranking or retrieval. LLMs are useful, but they are nondeterministic and hard to control at scale.
I used an LLM only to understand user intent. The LLM parses the user query and extracts:
- Hard filters like number of bedrooms or location
- A flag that indicates whether image‑based search is required
Examples
- “2 bedroom apartment in Manhattan” → only deterministic filters and keyword search
- “2 bedroom house with backyard having large trees” → requires visual understanding and triggers Deep Search
Limiting the LLM to intent parsing kept the rest of the system predictable and debuggable.
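To illustrate that split, the parsed intent can be treated as a small structured object that the deterministic part of the system consumes. The sketch below is only a hypothetical shape, not the production schema: `ParsedQuery`, its field names, and `choose_pipeline` are assumptions made for illustration, and the LLM call that would fill in the object is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedQuery:
    """Hypothetical structured output of the intent-parsing step."""
    keywords: str                      # text left over for keyword search
    bedrooms: Optional[int] = None     # hard filter
    location: Optional[str] = None     # hard filter
    needs_image_search: bool = False   # True when visual understanding is required


def choose_pipeline(query: ParsedQuery) -> str:
    """Route deterministically on the flag extracted by the LLM."""
    return "deep_search" if query.needs_image_search else "keyword_search"


# The two example queries above, as they might look after parsing.
simple = ParsedQuery(keywords="apartment", bedrooms=2, location="Manhattan")
visual = ParsedQuery(
    keywords="house backyard large trees",
    bedrooms=2,
    needs_image_search=True,
)
```

Everything downstream of this object is plain code, so a surprising result can be traced back to the parsed filters and the flag rather than to model behavior.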
Challenges Found During Implementation
While redesigning the system, I discovered a major issue with how results from different retrieval systems were merged.
- BM25 search has its own ranking and pagination.
- Vector search also has its own ranking and pagination.
When results were paginated first and then merged, pagination broke completely. Page 2 from BM25 and Page 2 from vector search did not represent the same set of results. Some pages contained mostly vector matches, some had none, and rankings changed between requests. This caused unstable and inconsistent results, which is unacceptable for a production search system.
The problem required rethinking how ranking and pagination were handled.
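A toy reproduction makes the failure easy to see. The rankings below are invented for illustration and the helper names are hypothetical; the point is only that paging each engine separately and merging afterwards lets the same property appear on more than one page.

```python
def paginate(items, page, page_size):
    """Return the 1-indexed page of a ranked list."""
    start = (page - 1) * page_size
    return items[start:start + page_size]


# Invented rankings from the two engines (best match first).
bm25_ranked = ["A", "B", "C", "D", "E", "F"]
vector_ranked = ["F", "E", "A", "G", "B", "H"]

PAGE_SIZE = 2


def paginate_then_merge(page):
    """The broken approach: each engine pages on its own, then results are merged."""
    merged = (
        paginate(bm25_ranked, page, PAGE_SIZE)
        + paginate(vector_ranked, page, PAGE_SIZE)
    )
    # Drop duplicates within the page while keeping first-seen order.
    return list(dict.fromkeys(merged))


page1 = paginate_then_merge(1)  # ['A', 'B', 'F', 'E']
page2 = paginate_then_merge(2)  # ['C', 'D', 'A', 'G']
```

Property `A` shows up on both pages because it is rank 1 in BM25 but rank 3 in vector search. No per-page deduplication can fix this, since each page is built without knowledge of the others.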
Solution
I built a hybrid search pipeline that runs whenever Deep Search is triggered, with a clear separation of concerns:
- Hard deterministic filters
- BM25 full‑text search using RediSearch
- Vector search on property images using pgvector
The key change is that none of these systems paginate independently anymore.
Merging results
I implemented a weighted variant of Reciprocal Rank Fusion to combine the ranked lists from the different search engines. For each property, a single hybrid score is computed:
- BM25 rank is converted into a reciprocal score
- Vector similarity scores are normalized
Weights are applied based on whether Deep Search is triggered:
# Hybrid scoring formula
hybridScore = alphaBM25 * bm25Score + betaVector * (vecScore / maxVec)
Only after this unified hybrid score is computed do we apply pagination. This guarantees stable ordering across pages.
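The fuse-then-paginate order can be sketched as follows. This is a minimal illustration under stated assumptions, not the production code: the function names, the smoothing constant `k` (60 is a value commonly used with Reciprocal Rank Fusion), the treatment of a property missing from one list as scoring 0 there, and the tie-break on property id are all assumptions that the text above does not specify.

```python
def hybrid_rank(bm25_ranked, vector_scores, alpha_bm25, beta_vector, k=60):
    """Fuse a BM25 ranking and vector similarity scores into one ordered list.

    bm25_ranked:   property ids, best match first
    vector_scores: property id -> similarity (higher is better)
    k:             assumed smoothing constant for the reciprocal rank
    """
    # Convert each BM25 rank (1-indexed) into a reciprocal score.
    bm25_score = {
        pid: 1.0 / (k + rank)
        for rank, pid in enumerate(bm25_ranked, start=1)
    }
    max_vec = max(vector_scores.values(), default=0.0)

    scored = []
    for pid in set(bm25_score) | set(vector_scores):
        # Normalize vector similarity by the maximum; missing entries count as 0.
        vec = vector_scores.get(pid, 0.0) / max_vec if max_vec > 0 else 0.0
        score = alpha_bm25 * bm25_score.get(pid, 0.0) + beta_vector * vec
        scored.append((pid, score))

    # Sort by score descending, then by id so ties always resolve the same way.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


def get_page(scored, page, page_size):
    """Paginate only after the full fused ordering exists."""
    start = (page - 1) * page_size
    return [pid for pid, _ in scored[start:start + page_size]]
```

Because every page is a slice of the same fully sorted list, pages cannot overlap, and the explicit tie-break keeps the order identical across requests for the same inputs. Note that with a large `k` the reciprocal BM25 scores are much smaller than the normalized vector scores, so the weights have to account for that difference in scale.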
Result
The new approach solved both performance and ranking issues:
- Latency reduced from ~5 minutes to under 10 seconds
- Pagination is stable and deterministic
- Visual matches surface naturally when users request visual features
- Keyword intent still influences ranking
- Results are consistent across pages