RAG Is More Than Vector Search
Source: Dev.to
Retrieval Augmented Generation (RAG) is usually associated with vector search. While that is the primary use case, any search method can be used.
- ✅ Vector search
- ✅ Web search
- ✅ SQL queries
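All three approaches plug into RAG through the same retrieval contract used by the examples below: a callable that takes a list of queries and a limit, and returns one list of `{"id", "text", "score"}` dicts per query. A minimal sketch with a toy keyword matcher (the corpus and matching logic here are illustrative, not part of txtai):

```python
# Any callable with this signature can serve as a RAG retriever:
# it takes a list of queries and a limit, and returns one list of
# {"id", "text", "score"} dicts per query.

def retrieve(queries, limit):
    corpus = [
        "txtai is an all-in-one AI framework",
        "Vector search finds semantically similar text",
        "SQL queries can also drive retrieval",
    ]

    results = []
    for query in queries:
        # Keep any text that shares a word with the query
        matches = [
            {"id": i, "text": text, "score": 1.0}
            for i, text in enumerate(corpus)
            if any(word in text.lower() for word in query.lower().split())
        ]
        results.append(matches[:limit])

    return results

print(retrieve(["vector search"], 2))
```

The real examples later in this article implement this same signature against a web search tool and a SQLite database.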
These examples require txtai 9.3+.
Install dependencies
Install txtai and all dependencies.
pip install txtai[pipeline-data]
# Download the example SQL database
wget https://huggingface.co/NeuML/txtai-wikipedia-slim/resolve/main/documents
RAG with Late Interaction
The first example demonstrates RAG with ColBERT / late interaction retrieval. txtai 9.0 added support for MUVERA and ColBERT multi-vector ranking.
We'll:
- Read the ColBERT v2 paper and extract its text as sections.
- Build an index using a ColBERT model.
- Wrap it as a Reranker pipeline that uses the same model.
- Use a RAG pipeline that leverages this retrieval method.
Note: This uses a custom ColBERT Muvera Nano model (≈970K parameters) that works surprisingly well.
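As background, ColBERT-style late interaction scores a query against a document token by token with a MaxSim operation: for each query token vector, take its best match over all document token vectors, then sum those maxima. A toy sketch with hand-picked 2-d vectors (the numbers are illustrative only):

```python
# Toy MaxSim: late-interaction scoring as used by ColBERT.
# Each text is a list of token vectors; the score sums, over query
# tokens, the best dot product against any document token.

def maxsim(query_vectors, doc_vectors):
    return sum(
        max(sum(q * d for q, d in zip(qv, dv)) for dv in doc_vectors)
        for qv in query_vectors
    )

query = [[1.0, 0.0], [0.0, 1.0]]  # two query token vectors
doc_a = [[0.9, 0.1], [0.1, 0.9]]  # strong match for both query tokens
doc_b = [[0.5, 0.5]]              # weaker single-token document

print(maxsim(query, doc_a) > maxsim(query, doc_b))  # → True
```

MUVERA then compresses these per-token multi-vectors into fixed-dimensional encodings so they can live in a standard vector index, which is what the model below does.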
from txtai import Embeddings, RAG, Textractor
from txtai.pipeline import Reranker, Similarity
# Get text from ColBERT v2 paper
textractor = Textractor(sections=True, backend="docling")
data = textractor("https://arxiv.org/pdf/2112.01488")
# MUVERA fixed‑dimensional encodings
embeddings = Embeddings(
    content=True,
    path="neuml/colbert-muvera-nano",
    vectors={"trust_remote_code": True},
)
embeddings.index(data)
# Re‑rank using the same late‑interaction model
reranker = Reranker(
    embeddings,
    Similarity(
        "neuml/colbert-muvera-nano",
        lateencode=True,
        vectors={"trust_remote_code": True},
    ),
)
template = """
Answer the following question using the provided context.
Question:
{question}
Context:
{context}
"""
# RAG with late interaction models
rag = RAG(reranker, "Qwen/Qwen3-4B-Instruct-2507", template=template, output="flatten")
print(rag("Write a sentence abstract about this paper", maxlength=2048))
This paper introduces ColBERTv2, a neural information retrieval model that enhances the quality and efficiency of late interaction by combining an aggressive residual compression mechanism with a denoised supervision strategy, achieving state‑of‑the‑art performance across diverse benchmarks while reducing the model's space footprint by 6–10× compared to previous methods.
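The Reranker pipeline above embodies a general two-stage pattern: over-fetch candidates with a cheap first stage, then reorder them with a more precise scorer. A generic sketch with stub scoring functions (this is the pattern, not the txtai API):

```python
# Generic retrieve-then-rerank sketch: a coarse first stage
# over-fetches candidates, then a finer scoring function reorders
# them. Both stages here are illustrative stubs.

def first_stage(query, corpus, limit):
    # Coarse retrieval: keep any text sharing a word with the query
    words = set(query.lower().split())
    return [text for text in corpus if words & set(text.lower().split())][:limit]

def rerank(query, candidates, score):
    # Precise second stage: reorder candidates by the finer score
    return sorted(candidates, key=lambda text: score(query, text), reverse=True)

corpus = [
    "late interaction models score token by token",
    "interaction design for user interfaces",
    "token pruning speeds up retrieval",
]

def overlap(query, text):
    # Word-overlap score standing in for a late-interaction model
    return len(set(query.lower().split()) & set(text.lower().split()))

candidates = first_stage("late interaction retrieval", corpus, 3)
print(rerank("late interaction retrieval", candidates, overlap)[0])
```

In the txtai pipeline, the Embeddings index plays the first-stage role and the late-interaction Similarity model plays the rescoring role.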
RAG with Web Search
Next, we run a RAG pipeline that uses web search as the retrieval method.
from smolagents import WebSearchTool
tool = WebSearchTool()
def websearch(queries, limit):
    results = []
    for query in queries:
        result = [
            {"id": i, "text": f'{x["title"]} {x["description"]}', "score": 1.0}
            for i, x in enumerate(tool.search(query))
        ]
        results.append(result[:limit])

    return results
# RAG with a websearch
rag = RAG(websearch, "Qwen/Qwen3-4B-Instruct-2507", template=template, output="flatten")
print(rag("What is AI?", maxlength=2048))
Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem‑solving, perception, and decision‑making. It involves technologies like machine learning, deep learning, and natural language processing, and enables machines to simulate human‑like learning, comprehension, problem solving, decision‑making, creativity, and autonomy.
RAG with SQL Queries
The final example shows RAG with SQL queries. We'll use the SQLite database that's part of the txtai-wikipedia-slim embeddings dataset.
We need to convert natural language queries into a SQL LIKE clause. An LLM extracts a keyword for this.
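The SQL side of this can be sketched with the standard library alone, using an in-memory database and a stubbed keyword extractor in place of the LLM (the table layout mirrors the example below, but the rows and the longest-word heuristic are illustrative):

```python
import sqlite3

# Standard-library sketch of keyword-driven SQL retrieval: extract a
# keyword from the query, then match it against ids with a LIKE clause.

def keyword(query):
    # Stand-in for the LLM call: pick the longest word as the keyword
    return max(query.split(), key=len)

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sections (id TEXT, text TEXT)")
connection.executemany(
    "INSERT INTO sections VALUES (?, ?)",
    [
        ("2025 World Series", "Article text about the 2025 World Series"),
        ("World War II", "Article text about World War II"),
        ("Python", "Article text about the Python language"),
    ],
)

sql = "SELECT id, text FROM sections WHERE id LIKE ? LIMIT ?"
kw = keyword("Tell me about the World Series")
rows = connection.execute(sql, [f"%{kw}%", 5]).fetchall()
print([uid for uid, _ in rows])  # → ['2025 World Series']
```

The real example swaps the heuristic for an LLM prompt and the in-memory table for the Wikipedia database downloaded earlier.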
import sqlite3
from txtai import LLM
def keyword(query):
    return llm(f"""
    Extract a keyword for this search query: {query}.
    Return only text with no other formatting or explanation.
    """)
def sqlsearch(queries, limit):
    results = []
    sql = "SELECT id, text FROM sections WHERE id LIKE ? LIMIT ?"

    for query in queries:
        # Extract a keyword for this search
        kw = keyword(query)

        # Run the SQL query
        results.append([
            {"id": uid, "text": text, "score": 1.0}
            for uid, text in cursor.execute(sql, [f"%{kw}%", limit])
        ])

    return results
# Connect to the database (Connection.execute creates cursors implicitly)
cursor = sqlite3.connect("documents")
# Load the LLM
llm = LLM("Qwen/Qwen3-4B-Instruct-2507")
# RAG with a SQL query
rag = RAG(sqlsearch, llm, template=template, output="flatten")
print(rag("Tell me what happened in the 2025 World Series", maxlength=2048))
In the 2025 World Series, the Los Angeles Dodgers defeated the Toronto Blue Jays in seven games to win the championship. The series took place from October 24 to November 1 (ending early on November 2, Toronto time). Dodgers pitcher Yoshinobu Yamamoto was named the World Series MVP. The series was televised by Fox in the United States and by Sportsnet in Canada.
Wrapping up
This article showed that RAG is much more than vector search. With txtai 9.3+, any callable method can now be used for retrieval. Happy coding!