The Life of a Search Query in OpenSearch
Overview
A search request typically looks like:
```
GET /my-index/_search
Content-Type: application/json

{
  "query": { "match": { "title": "opensearch" } },
  "size": 10,
  "from": 0,
  "_source": true
}
```
The client can be anything from curl to a Python SDK; the wire format is always JSON over HTTP.
HTTP Request and Coordinating Node
- HTTP server – OpenSearch runs a lightweight HTTP server that parses the request.
- Coordinating node – The node that receives the request becomes the coordinating node. Any node in the cluster can act as the coordinator; it does not need to store data.
Sharding and Routing
- Data is stored in shards – Lucene indexes distributed across the cluster.
- Each document is assigned to a primary shard based on a routing value (default: the document _id). The routing formula is:

```
hash(routing) % number_of_primary_shards
```
- The coordinating node runs this hash function to determine which primary shards are responsible for the query.
- For a simple term query across a single index, the coordinator may need to contact all primary shards of that index. Supplying a specific routing value can dramatically reduce the shard set and improve latency.
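For example, if documents were indexed with a per-user routing value, a query can be pinned to that user's shard via the routing query parameter (the index name and user-42 value here are illustrative):

```
GET /my-index/_search?routing=user-42
Content-Type: application/json

{
  "query": { "match": { "title": "opensearch" } }
}
```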
Query Execution on Shards
Once the responsible shards are known, the coordinator forwards the query to the shard nodes (which may be the same physical node or a different one). Each shard executes the query locally against its Lucene segments.
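To check which shard copies a given request would actually hit, the _search_shards API returns the routing table the coordinator would use; a quick sketch with the same illustrative routing value:

```
GET /my-index/_search_shards?routing=user-42
```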
Segment Search
- Lucene stores data in immutable segments. During the query phase, each segment is searched independently.
- OpenSearch can search segments in parallel within a shard – a feature called concurrent segment search (generally available since version 2.12). The engine automatically decides how many slices to create based on CPU cores and segment count.
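Concurrent segment search can be toggled dynamically; a minimal sketch, assuming the search.concurrent_segment_search.enabled cluster setting from recent releases:

```
PUT /_cluster/settings
Content-Type: application/json

{
  "persistent": {
    "search.concurrent_segment_search.enabled": true
  }
}
```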
Scoring (BM25)
- For each matching document, Lucene computes a relevance score using the BM25 algorithm.
- Key parameters: term frequency, inverse document frequency, the saturation parameter k1 (default 1.2), and the length‑normalisation factor b (default 0.75).
- Each shard returns its top‑k (default 10) documents together with their scores.
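If the BM25 defaults don't fit your corpus, k1 and b can be tuned through a custom similarity at index creation; a sketch with illustrative index, similarity, and field names:

```
PUT /my-index
Content-Type: application/json

{
  "settings": {
    "index": {
      "similarity": {
        "tuned_bm25": { "type": "BM25", "k1": 1.2, "b": 0.5 }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text", "similarity": "tuned_bm25" }
    }
  }
}
```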
Fetch Phase
The query phase only returns document IDs and scores. If the client requested the _source field (as most do), a second round called the fetch phase runs:
- The coordinating node asks each shard for the full source of the selected documents.
- Shards retrieve the stored _source from Lucene’s stored fields and send it back.
Because the fetch phase may move larger payloads over the network, OpenSearch tries to keep the number of fetched documents small. Pagination (from/size) and stored_fields filters are important performance knobs.
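A common way to shrink fetch-phase payloads is to request only the fields you need instead of the full _source, for example:

```
GET /my-index/_search
Content-Type: application/json

{
  "query": { "match": { "title": "opensearch" } },
  "_source": ["title"],
  "size": 10
}
```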
Merging Results
After receiving the top‑k results from each shard, the coordinating node:
- Merges them into a single ranked list.
- Sorts the combined set based on the BM25 scores returned by each shard, or on any custom sort rules specified in the query.
- Re‑applies the global size and from parameters to select the final page.
The final merged list is formatted as a JSON response and sent back to the client.
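When a custom sort is supplied, it replaces BM25 ordering at the merge step. A sketch sorting on a hypothetical created_at date field and requesting the second page:

```
GET /my-index/_search
Content-Type: application/json

{
  "query": { "match": { "title": "opensearch" } },
  "sort": [ { "created_at": "desc" } ],
  "from": 10,
  "size": 10
}
```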
Near‑Real‑Time Indexing
While a query is being processed, OpenSearch maintains a near‑real‑time view of the data:
- New documents are first written to an in‑memory buffer and appended to the translog for durability.
- Every second by default (index.refresh_interval), the buffer is flushed to a new Lucene segment, making the freshly indexed documents searchable.
- This results in a typical < 1‑second lag between indexing and visibility in search results.
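Because index.refresh_interval is a dynamic setting, it can be tuned per index on the fly (shorter for fresher results, longer for cheaper bulk indexing):

```
PUT /my-index/_settings
Content-Type: application/json

{
  "index": { "refresh_interval": "500ms" }
}
```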
Common Issues and Mitigations
| Issue | Why it Happens | Mitigation |
|---|---|---|
| Slow query latency | Too many shards queried, high segment count | Use routing, configure index.routing_partition_size, force‑merge to reduce segments |
| High CPU usage | Concurrent segment search on large shards | Tune search.max_concurrent_shard_requests and the concurrent-search slice count (search.concurrent.max_slice_count) |
| Stale results | Refresh interval too large for real‑time needs | Reduce index.refresh_interval on hot indices |
| Large payloads | Fetching full _source for many docs | Use stored_fields or docvalue_fields, limit size |
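As a concrete example of the segment-count mitigation above, a read-only index can be force-merged down to a single segment (best avoided on indices still receiving writes):

```
POST /my-index/_forcemerge?max_num_segments=1
```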
Conclusion
A search query in OpenSearch is more than a simple HTTP call. It involves routing, parallel shard execution, scoring, optional fetching, and a final merge step that stitches everything together. Understanding each stage helps you design better schemas, tune performance, and avoid common pitfalls such as unnecessary shard scans or excessive refresh intervals. By visualising the journey of a query, you gain the confidence to diagnose latency issues, choose the right indexing strategies, and make the most of OpenSearch’s powerful plugin and analysis ecosystems.