The Life of a Search Query in OpenSearch

Published: May 1, 2026 at 09:10 AM EDT
4 min read
Source: Dev.to

Overview

A search request typically looks like:

GET /my-index/_search
Content-Type: application/json

{
  "query": { "match": { "title": "opensearch" } },
  "size": 10,
  "from": 0,
  "_source": true
}

The client can be anything from curl to a Python SDK; the wire format is always JSON over HTTP.

HTTP Request and Coordinating Node

  1. HTTP server – OpenSearch runs a lightweight HTTP server that parses the request.
  2. Coordinating node – The node that receives the request becomes the coordinating node. Any node in the cluster can act as the coordinator; it does not need to store data.

Sharding and Routing

  • Data is stored in shards – Lucene indexes distributed across the cluster.
  • Each document is assigned to a primary shard based on a routing value (default: the document _id), using the formula hash(routing) % number_of_primary_shards.
  • The coordinating node runs this hash function to determine which shards are responsible for the query.
  • For a simple term query across a single index, the coordinator generally has to contact one copy (primary or replica) of every shard in that index. Supplying a specific routing value can shrink that shard set dramatically and improve latency.
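The routing step can be simulated in a few lines. Note the hedge: OpenSearch actually hashes the routing string with Murmur3; `zlib.crc32` is used here only so the sketch runs with the standard library, and `pick_shard` is an illustrative name, not an OpenSearch API.

```python
import zlib

def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Illustrative stand-in for hash(routing) % number_of_primary_shards.

    OpenSearch uses a Murmur3 hash internally; crc32 is a placeholder here.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards

# With no explicit routing value, the document _id is used as the routing key.
print(pick_shard("doc-42", 5))  # a stable shard number in [0, 5)
```

The important property is determinism: the same routing value always maps to the same shard, which is what lets both indexing and search find a document without a lookup table.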

Query Execution on Shards

Once the responsible shards are known, the coordinator forwards the query to the shard nodes (which may be the same physical node or a different one). Each shard executes the query locally against its Lucene segments.

  • Lucene stores data in immutable segments. During the query phase, each segment is searched independently.
  • OpenSearch can search segments in parallel within a shard – a feature called concurrent segment search (generally available since version 2.12). The engine automatically decides how many slices to create based on CPU cores and segment size.
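The per-shard flow above can be sketched as follows: each immutable segment is searched independently (here concurrently via a thread pool), and the per-segment hits are combined into the shard-local top-k. The data structures and function names are illustrative toys, not OpenSearch internals.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Each "segment" is a toy posting list: doc_id -> score for docs matching a term.
segments = [
    {"d1": 1.2, "d7": 0.4},
    {"d3": 2.1},
    {"d5": 0.9, "d9": 1.7},
]

def search_segment(segment):
    # In Lucene, each immutable segment is searched independently.
    return list(segment.items())

def search_shard(segments, k=2):
    # Segments are searched concurrently (as in concurrent segment search),
    # then merged into the shard-local top-k by score.
    with ThreadPoolExecutor() as pool:
        per_segment = pool.map(search_segment, segments)
    all_hits = [hit for hits in per_segment for hit in hits]
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[1])

print(search_shard(segments, k=2))  # [('d3', 2.1), ('d9', 1.7)]
```

Because segments are immutable, no locking is needed across the slices; that is what makes this parallelism cheap.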

Scoring (BM25)

  • For each matching document, Lucene computes a relevance score using the BM25 algorithm.
  • Key parameters: term frequency, inverse document frequency, and document-length normalisation, controlled by k1 (default 1.2) and b (default 0.75).
  • Each shard returns the top‑k (default 10) documents together with their scores.
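As a sketch, the single-term BM25 formula with Lucene's defaults looks like this. Real Lucene scoring adds details such as query boosts and precomputed norms, so treat this as the textbook form, not the exact engine code.

```python
import math

def bm25_term_score(tf, df, num_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Single-term BM25 score (Lucene-style defaults k1=1.2, b=0.75).

    tf: term frequency in the document; df: number of documents containing
    the term; doc_len / avg_doc_len: field length and its corpus average.
    """
    # Rarer terms get a larger inverse-document-frequency weight.
    idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
    # Term frequency saturates via k1; b penalises longer-than-average docs.
    norm_tf = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm_tf

# A rare term in a short document scores higher than a common term in a long one.
print(bm25_term_score(tf=2, df=5, num_docs=1_000, doc_len=100, avg_doc_len=120))
```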

Fetch Phase

The query phase only returns document IDs and scores. If the client requested the _source field (as most do), a second round called the fetch phase runs:

  1. The coordinating node asks each shard for the full source of the selected documents.
  2. Shards retrieve the stored _source from Lucene’s stored fields and send it back.

Because the fetch phase may move larger payloads over the network, OpenSearch tries to keep the number of fetched documents small. Pagination (from/size) and stored_fields filters are important performance knobs.
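For example, restricting _source to the fields the client actually needs keeps fetch-phase payloads small (the field names here are illustrative):

```json
GET /my-index/_search
{
  "query": { "match": { "title": "opensearch" } },
  "size": 10,
  "from": 0,
  "_source": ["title", "url"]
}
```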

Merging Results

After receiving the top‑k results from each shard, the coordinating node:

  1. Merges the per-shard lists into a single set of candidates.
  2. Sorts the combined set by the BM25 scores returned by each shard, or by any custom sort rules specified in the query.
  3. Re-applies the global from and size parameters to produce the final page of results.

The final merged list is formatted as a JSON response and sent back to the client.
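Since each shard already returns its hits sorted by score, the coordinator's job is a k-way merge followed by the global from/size window. A minimal sketch (`merge_shard_results` is an illustrative name, not an OpenSearch API):

```python
import heapq

# Top-k hits returned by each shard during the query phase: (doc_id, score),
# already sorted by descending score within each shard.
shard_results = [
    [("d3", 2.1), ("d1", 1.2)],   # shard 0
    [("d9", 1.7), ("d5", 0.9)],   # shard 1
    [("d2", 1.5)],                # shard 2
]

def merge_shard_results(shard_results, size=10, from_=0):
    # k-way merge of the sorted per-shard lists, highest score first,
    # then the global from/size window is applied to the merged stream.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[1], reverse=True)
    return list(merged)[from_ : from_ + size]

print(merge_shard_results(shard_results, size=3))
# [('d3', 2.1), ('d9', 1.7), ('d2', 1.5)]
```

This also shows why deep pagination is expensive: to serve from=10000, every shard must still return its top 10010 hits for the coordinator to merge.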

Near‑Real‑Time Indexing

While a query is being processed, OpenSearch maintains a near‑real‑time view of the data:

  • New documents are first written to an in‑memory buffer and appended to the translog for durability.
  • Every second (default index.refresh_interval), the buffer is flushed to a new Lucene segment, making the freshly indexed documents searchable.
  • This results in a typical < 1‑second lag between indexing and visibility in search results.
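The refresh cadence is tunable per index; for example, relaxing it on a write-heavy index trades a little freshness for indexing throughput:

```json
PUT /my-index/_settings
{
  "index": { "refresh_interval": "5s" }
}
```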

Common Issues and Mitigations

| Issue | Why it happens | Mitigation |
| --- | --- | --- |
| Slow query latency | Too many shards queried, high segment count | Use routing, configure `index.routing_partition_size`, force-merge to reduce segments |
| High CPU usage | Concurrent segment search on large shards | Tune `search.max_concurrent_shard_requests` and `search.max_concurrent_segments` |
| Stale results | Refresh interval too large for real-time needs | Reduce `index.refresh_interval` on hot indices |
| Large payloads | Fetching full `_source` for many docs | Use `stored_fields` or `docvalue_fields`, limit `size` |

Conclusion

A search query in OpenSearch is more than a simple HTTP call. It involves routing, parallel shard execution, scoring, optional fetching, and a final merge step that stitches everything together. Understanding each stage helps you design better schemas, tune performance, and avoid common pitfalls such as unnecessary shard scans or excessive refresh intervals. By visualising the journey of a query, you gain the confidence to diagnose latency issues, choose the right indexing strategies, and make the most of OpenSearch’s powerful plugin and analysis ecosystems.
