Atlas Search score details (the BM25 calculation)

Published: December 20, 2025 (06:05 GMT+8)

Source: Dev.to

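Revisiting "Text Search With MongoDB (BM25 TF-IDF) and PostgreSQL"

Back in October, Franck Pachot of MongoDB (great work!) published an article comparing text search in MongoDB and PostgreSQL (using the built-in tsvector and ParadeDB's pg_search extension). I won't recap the whole article, but the key takeaway is that MongoDB behaved exactly as expected, returning BM25 scores that match the theoretical calculation.

Score details

The MongoDB Atlas Score Details documentation (https://www.mongodb.com/docs/atlas/atlas-search/score/get-details?utm_campaign=devrel&utm_source=third-party-content&utm_term=franck_pachot&utm_medium=devto&utm_content=scoreDetails) explains how the score is computed. Below is the test case I used (the same one as in my previous blog post).

Test data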

db.articles.drop();
db.articles.deleteMany({});

db.articles.insertMany([
  { description: "🍏 🍌 🍊" },                     // short, 1 🍏
  { description: "🍎 🍌 🍊" },                     // short, 1 🍎
  { description: "🍎 🍌 🍊 🍎" },                  // larger, 2 🍎
  { description: "🍎 🍌 🍊 🍊 🍊" },               // larger, 1 🍎
  { description: "🍎 🍌 🍊 🌴 🫐 🍈 🍇 🌰" },      // large, 1 🍎
  { description: "🍎 🍎 🍎 🍎 🍎 🍎" },           // large, 6 🍎
  { description: "🍎 🍌" },                       // very short, 1 🍎
  { description: "🍌 🍊 🌴 🫐 🍈 🍇 🌰 🍎" },      // large, 1 🍎
  { description: "🍎 🍎 🍌 🍌 🍌" }               // shorter, 2 🍎
]);

db.articles.createSearchIndex("default", {
  mappings: { dynamic: true }
});

Query with score details

db.articles.aggregate([
  {
    $search: {
      text: { query: ["🍎", "🍏"], path: "description" },
      index: "default",
      scoreDetails: true
    }
  },
  {
    $project: {
      _id: 0,
      description: 1,
      score: { $meta: "searchScore" },
      scoreDetails: { $meta: "searchScoreDetails" }
    }
  },
  { $sort: { score: -1 } },
  { $limit: 1 }
]);

Result

[
  {
    "description": "🍏 🍌 🍊",
    "score": 1.0242118835449219,
    "scoreDetails": {
      "value": 1.0242118835449219,
      "description": "sum of:",
      "details": [
        {
          "value": 1.0242118835449219,
          "description": "$type:string/description:🍏 [BM25Similarity], result of:",
          "details": [
            {
              "value": 1.0242118835449219,
              "description": "score(freq=1.0), computed as boost * idf * tf from:",
              "details": [
                {
                  "value": 1.8971199989318848,
                  "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                  "details": [
                    { "value": 1, "description": "n, number of documents containing term", "details": [] },
                    { "value": 9, "description": "N, total number of documents with field", "details": [] }
                  ]
                },
                {
                  "value": 0.5398772954940796,
                  "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                  "details": [
                    { "value": 1, "description": "freq, occurrences of term within document", "details": [] },
                    { "value": 1.2000000476837158, "description": "k1, term saturation parameter", "details": [] },
                    { "value": 0.75, "description": "b, length normalization parameter", "details": [] },
                    { "value": 3, "description": "dl, length of field", "details": [] },
                    { "value": 4.888888835906982, "description": "avgdl, average length of field", "details": [] }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  }
]
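
The nested output is easier to check against the formulas once the leaf parameters are pulled out. Here is a minimal sketch of that; collectLeaves is a hypothetical helper of mine, not part of the Atlas Search API, and it assumes each leaf description starts with a short name followed by a comma, as in the result above.

```javascript
// Hypothetical helper (not an Atlas Search API): walk a scoreDetails tree
// and collect the leaf values, keyed by the short name before the comma.
function collectLeaves(node, out = {}) {
  if (!node.details || node.details.length === 0) {
    // Leaf descriptions look like "n, number of documents containing term".
    const name = node.description.split(",")[0].trim();
    out[name] = node.value;
    return out;
  }
  for (const child of node.details) {
    collectLeaves(child, out);
  }
  return out;
}

// Copy of the score details shown above, with the outer wrappers trimmed.
const scoreDetails = {
  value: 1.0242118835449219,
  description: "score(freq=1.0), computed as boost * idf * tf from:",
  details: [
    {
      value: 1.8971199989318848,
      description: "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
      details: [
        { value: 1, description: "n, number of documents containing term", details: [] },
        { value: 9, description: "N, total number of documents with field", details: [] }
      ]
    },
    {
      value: 0.5398772954940796,
      description: "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
      details: [
        { value: 1, description: "freq, occurrences of term within document", details: [] },
        { value: 1.2000000476837158, description: "k1, term saturation parameter", details: [] },
        { value: 0.75, description: "b, length normalization parameter", details: [] },
        { value: 3, description: "dl, length of field", details: [] },
        { value: 4.888888835906982, description: "avgdl, average length of field", details: [] }
      ]
    }
  ]
};

const leaves = collectLeaves(scoreDetails);
console.log(leaves);
```

When several terms match the same document, the leaf names repeat and later values overwrite earlier ones in this sketch, so it is only suitable for a single-term breakdown like this one.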

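Observations

The BM25 scores returned by MongoDB Atlas are about 2.2 times lower than the scores that Elasticsearch and ParadeDB produce for the same query and dataset.
The detailed breakdown shows that the idf and tf components are computed correctly; the difference appears to come from the final multiplication step (for example, a different boost factor or normalization step used internally by Atlas).

Next steps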

Validate boost settings – ensure no hidden boost is applied to the index or query.
Compare raw term frequencies – confirm that freq, dl, and avgdl match across engines.
Reach out to MongoDB support – share the detailed score breakdown to investigate the scaling factor.

Feel free to comment or open a discussion if you have insights into why Atlas applies this scaling!

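Score breakdown for 🍏 🍌 🍊

The query produced a score of 1.0242118835449219.

IDF calculation (inverse document frequency)

Search result

Number of documents containing the term: n = 1
Total number of documents with this field: N = 9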

idf = log(1 + (N - n + 0.5) / (n + 0.5))
    = log(1 + (9 - 1 + 0.5) / (1 + 0.5))
    = log(6.666666666666667)
    ≈ 1.8971199989318848
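
As a quick check, the same idf can be reproduced in plain JavaScript. This is only a sketch: bm25Idf is a name I made up, and it assumes log in the score details means the natural logarithm, which is what Math.log computes.

```javascript
// idf as printed in the score details: log(1 + (N - n + 0.5) / (n + 0.5)).
// Math.log is the natural logarithm.
function bm25Idf(n, N) {
  return Math.log(1 + (N - n + 0.5) / (n + 0.5));
}

// n = 1 document contains 🍏, N = 9 documents have the field.
const idf = bm25Idf(1, 9);
console.log(idf); // approximately 1.89712
```

The result differs from the Atlas value only in the last digits, because JavaScript computes in double precision while the values in the score details look like single-precision floats.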

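TF calculation (term frequency)

Parameters (Lucene defaults)

Term saturation parameter: k1 = 1.2000000476837158
Length normalization parameter: b = 0.75

Document field statistics

Average length of the field: avgdl = 44 / 9 ≈ 4.888888835906982
Document length (dl): 3
Occurrences of the term in this document: freq = 1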
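
The field statistics can be recomputed from the test data. The sketch below assumes the analyzer emits one token per whitespace-separated emoji, which is consistent with dl = 3 in the score details but is my assumption rather than something the output states.

```javascript
// The nine descriptions inserted in the test data.
const descriptions = [
  "🍏 🍌 🍊",
  "🍎 🍌 🍊",
  "🍎 🍌 🍊 🍎",
  "🍎 🍌 🍊 🍊 🍊",
  "🍎 🍌 🍊 🌴 🫐 🍈 🍇 🌰",
  "🍎 🍎 🍎 🍎 🍎 🍎",
  "🍎 🍌",
  "🍌 🍊 🌴 🫐 🍈 🍇 🌰 🍎",
  "🍎 🍎 🍌 🍌 🍌"
];

// Assumption: one token per whitespace-separated emoji.
const lengths = descriptions.map((d) => d.trim().split(/\s+/).length);
const totalTokens = lengths.reduce((sum, len) => sum + len, 0);
const avgdl = totalTokens / lengths.length;

console.log(lengths[0]);  // dl of "🍏 🍌 🍊": 3
console.log(totalTokens); // 44
console.log(avgdl);       // 44 / 9, approximately 4.8889
```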

tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
   = 1 / (1 + 1.2000000476837158 × (0.25 + 0.75 × (3 / 4.888888835906982)))
   ≈ 0.5398772954940796
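
The tf component can be checked the same way. In this sketch bm25Tf is a hypothetical name, and k1 = 1.2 and b = 0.75 are passed as defaults to mirror the Lucene defaults reported above.

```javascript
// tf as printed in the score details:
// freq / (freq + k1 * (1 - b + b * dl / avgdl))
function bm25Tf(freq, dl, avgdl, k1 = 1.2, b = 0.75) {
  return freq / (freq + k1 * (1 - b + b * dl / avgdl));
}

// freq = 1, dl = 3, avgdl = 44 / 9 for the 🍏 🍌 🍊 document.
const tf = bm25Tf(1, 3, 44 / 9);
console.log(tf); // approximately 0.53988
```

Raising freq with everything else fixed pushes tf toward 1 without ever reaching it, which is the saturation behaviour that k1 controls.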

Final score

Parameters

Boost: 1.0

score = boost × idf × tf
      = 1.0 × 1.8971199989318848 × 0.5398772954940796
      ≈ 1.0242118835449219
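
Putting the pieces together, here is a minimal end-to-end sketch of the calculation. bm25Score is a hypothetical function written for this post, not a MongoDB or Lucene API; it only re-implements the formulas printed in the score details.

```javascript
// Recompute boost * idf * tf from the parameters in the score details.
function bm25Score({ n, N, freq, dl, avgdl, k1 = 1.2, b = 0.75, boost = 1.0 }) {
  const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));
  const tf = freq / (freq + k1 * (1 - b + b * dl / avgdl));
  return boost * idf * tf;
}

// Parameters for the term 🍏 in the document "🍏 🍌 🍊".
const score = bm25Score({ n: 1, N: 9, freq: 1, dl: 3, avgdl: 44 / 9 });
console.log(score); // approximately 1.02421
```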

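This confirms that Atlas Search uses exactly the same scoring formula as Lucene.

What about Elasticsearch and Tantivy?

Eight years ago, Lucene removed the (k1 + 1) factor in LUCENE-8563.
With k1 = 1.2, this change divides every score by k1 + 1 = 2.2 from that version onward.

Elasticsearch and Tantivy still use the old formula (with the (k1 + 1) factor).
Atlas Search uses the updated Lucene formula, which explains the observed score difference.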
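
To make the factor concrete, here is a small sketch comparing the two tf variants for the example document. The function names are mine, and the legacy variant is simply the current formula with the (k1 + 1) numerator put back, as described above.

```javascript
const k1 = 1.2;
const b = 0.75;

// Current Lucene formula, as printed in the Atlas score details.
function tfCurrent(freq, dl, avgdl) {
  return freq / (freq + k1 * (1 - b + b * dl / avgdl));
}

// Older formula, which keeps the (k1 + 1) factor in the numerator.
function tfLegacy(freq, dl, avgdl) {
  return (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * dl / avgdl));
}

const idf = Math.log(1 + (9 - 1 + 0.5) / (1 + 0.5));
const currentScore = idf * tfCurrent(1, 3, 44 / 9);
const legacyScore = idf * tfLegacy(1, 3, 44 / 9);

console.log(currentScore);               // approximately 1.0242
console.log(legacyScore);                // approximately 2.2533
console.log(legacyScore / currentScore); // approximately 2.2, that is k1 + 1
```

The ratio does not depend on freq, dl, or avgdl, so it is the same constant for every matching document and term.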

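Conclusion

MongoDB Atlas Search indexes use the same BM25 scoring as Lucene indexes.
When comparing Atlas Search with other Lucene-based or Lucene-inspired engines (for example, Elasticsearch or Tantivy), you may see scores that differ by a factor of about 2.2.
This difference does not affect the order of the results: scores are only used for ranking, and the relative order stays the same.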
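
Here is a small sketch of why the ordering is preserved, restricted to the single term 🍎 for simplicity. The freq and dl values are read off the test data, the legacy score is modelled as (k1 + 1) times the current score, and all variable names are illustrative.

```javascript
// Documents containing 🍎 in the test data: [description, freq, dl].
const docs = [
  ["🍎 🍌 🍊", 1, 3],
  ["🍎 🍌 🍊 🍎", 2, 4],
  ["🍎 🍌 🍊 🍊 🍊", 1, 5],
  ["🍎 🍌 🍊 🌴 🫐 🍈 🍇 🌰", 1, 8],
  ["🍎 🍎 🍎 🍎 🍎 🍎", 6, 6],
  ["🍎 🍌", 1, 2],
  ["🍌 🍊 🌴 🫐 🍈 🍇 🌰 🍎", 1, 8],
  ["🍎 🍎 🍌 🍌 🍌", 2, 5]
];

const N = 9;           // documents with the field
const n = docs.length; // documents containing 🍎
const avgdl = 44 / 9;
const k1 = 1.2;
const b = 0.75;
const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));

const scored = docs.map(([description, freq, dl]) => {
  const current = idf * freq / (freq + k1 * (1 - b + b * dl / avgdl));
  // Legacy score modelled as the current score times (k1 + 1).
  return { description, current, legacy: (k1 + 1) * current };
});

// Sort a copy by the given score, highest first, and keep the descriptions.
const rank = (key) =>
  [...scored].sort((x, y) => y[key] - x[key]).map((d) => d.description);

const orderCurrent = rank("current");
const orderLegacy = rank("legacy");
console.log(orderCurrent.join(" | ") === orderLegacy.join(" | ")); // true
```

Because every per-term score is multiplied by the same positive constant, sums over several terms are scaled by that constant too, so the argument carries over to multi-term queries.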

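Text search scoring is deterministic and based on open-source formulas. In MongoDB, you can request score details in a search query to see all the parameters and calculations that produced a given score: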

// Original snippet (kept for reference)
[
  // …
]
