RAG is more than Vector Search

Published: December 13, 2025, 04:36 AM GMT+9

Source: Dev.to

Retrieval Augmented Generation (RAG) is often associated with vector search. While that is the primary use case, any search method can serve as the retrieval step.

✅ Vector search
✅ Web search
✅ SQL query

These examples require txtai 9.3+.
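
The examples that follow share one pattern: the first argument to RAG is something callable that performs the search. Judging from the functions defined later in this article, a plain function used this way takes a list of queries plus a limit and returns, for each query, a list of dictionaries with `id`, `text` and `score` keys. The sketch below illustrates that shape with a hypothetical in-memory keyword matcher. `DOCS` and `keywordsearch` are illustrative names that are not part of txtai, and the word-overlap score is an assumption chosen only for demonstration.

```python
# Hypothetical in-memory corpus, for illustration only
DOCS = [
    "ColBERT is a late interaction retrieval model",
    "SQLite is an embedded SQL database engine",
    "Web search engines crawl and index pages",
]

def keywordsearch(queries, limit):
    """Return up to `limit` results per query as id/text/score dicts."""
    results = []
    for query in queries:
        terms = set(query.lower().split())
        scored = []
        for uid, text in enumerate(DOCS):
            words = set(text.lower().split())
            # Fraction of query terms found in the document
            score = len(terms & words) / len(terms) if terms else 0.0
            if score > 0:
                scored.append({"id": uid, "text": text, "score": score})

        # Best matches first, truncated to the requested limit
        scored.sort(key=lambda x: x["score"], reverse=True)
        results.append(scored[:limit])
    return results

print(keywordsearch(["sql database", "late interaction"], 2))
```

A function with this shape could presumably be passed to RAG in the same way that `websearch` and `sqlsearch` are in the sections below.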

Install dependencies

Install txtai and all dependencies.

pip install txtai[pipeline-data]

# Download example SQL database
wget https://huggingface.co/NeuML/txtai-wikipedia-slim/resolve/main/documents

RAG with Late Interaction

The first example covers RAG with ColBERT / Late Interaction retrieval. txtai 9.0 added support for MUVERA and ColBERT multi-vector ranking.

We'll do the following:

Read the ColBERT v2 paper and extract the text into sections.
Build an index with a ColBERT model.
Wrap the index as a Reranker pipeline using the same model.
Run a RAG pipeline that uses this retrieval method.

Note: This uses a custom ColBERT Muvera Nano model (≈970K parameters). It is surprisingly effective.
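
As background, late interaction models such as ColBERT keep one vector per token instead of one vector per document. A document's score for a query is the sum, over query tokens, of the maximum similarity to any document token (often called MaxSim). The following is a minimal pure-Python sketch of that scoring rule using made-up 2-dimensional vectors. It is not how txtai or the model implements scoring internally, and `dot` and `maxsim` are illustrative names.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim(query, document):
    """Sum over query token vectors of the best match among document token vectors."""
    return sum(max(dot(q, d) for d in document) for q in query)

# Toy token vectors (assumed already normalized), not real model output
query = [[1.0, 0.0], [0.0, 1.0]]
doc1 = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
doc2 = [[0.6, 0.8], [0.8, 0.6]]

# doc1 contains an exact match for each query token, so it scores higher
print(maxsim(query, doc1))
print(maxsim(query, doc2))
```

In the article's code below, the comments indicate that MUVERA fixed-dimensional encodings back the index, while the Reranker applies the same late interaction model to the retrieved candidates.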

from txtai import Embeddings, RAG, Textractor
from txtai.pipeline import Reranker, Similarity

# Get text from ColBERT v2 paper
textractor = Textractor(sections=True, backend="docling")
data = textractor("https://arxiv.org/pdf/2112.01488")

# MUVERA fixed‑dimensional encodings
embeddings = Embeddings(
    content=True,
    path="neuml/colbert-muvera-nano",
    vectors={"trust_remote_code": True},
)
embeddings.index(data)

# Re‑rank using the same late‑interaction model
reranker = Reranker(
    embeddings,
    Similarity(
        "neuml/colbert-muvera-nano",
        lateencode=True,
        vectors={"trust_remote_code": True},
    ),
)

template = """
Answer the following question using the provided context.

Question:
{question}

Context:
{context}
"""

# RAG with late interaction models
rag = RAG(reranker, "Qwen/Qwen3-4B-Instruct-2507", template=template, output="flatten")
print(rag("Write a sentence abstract about this paper", maxlength=2048))

This paper introduces ColBERTv2, a neural information retrieval model that enhances the quality and efficiency of late interaction by combining an aggressive residual compression mechanism with a denoised supervision strategy, achieving state‑of‑the‑art performance across diverse benchmarks while reducing the model's space footprint by 6–10× compared to previous methods.

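RAG with a Web Search

Next, we'll run a RAG pipeline that uses a web search as the retrieval method. The code imports WebSearchTool from smolagents, so that package needs to be installed if it is not already present.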

from smolagents import WebSearchTool

tool = WebSearchTool()

def websearch(queries, limit):
    results = []
    for query in queries:
        result = [
            {"id": i, "text": f'{x["title"]} {x["description"]}', "score": 1.0}
            for i, x in enumerate(tool.search(query))
        ]
        results.append(result[:limit])
    return results

# RAG with a websearch
rag = RAG(websearch, "Qwen/Qwen3-4B-Instruct-2507", template=template, output="flatten")
print(rag("What is AI?", maxlength=2048))

Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem‑solving, perception, and decision‑making. It involves technologies like machine learning, deep learning, and natural language processing, and enables machines to simulate human‑like learning, comprehension, problem solving, decision‑making, creativity, and autonomy.

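RAG with a SQL Query

The last example covers RAG with a SQL query. We'll use the SQLite database that ships with the txtai-wikipedia-slim embeddings dataset.

Since this is plain SQL with a LIKE clause, the natural language query has to be reduced to a search term first. An LLM extracts that keyword.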

import sqlite3
from txtai import LLM

def keyword(query):
    return llm(f"""
        Extract a keyword for this search query: {query}.
        Return only text with no other formatting or explanation.
    """)

def sqlsearch(queries, limit):
    results = []
    sql = "SELECT id, text FROM sections WHERE id LIKE ? LIMIT ?"

    for query in queries:
        # Extract a keyword for this search
        kw = keyword(query)

        # Run the SQL query
        results.append([
            {"id": uid, "text": text, "score": 1.0}
            for uid, text in cursor.execute(sql, [f"%{kw}%", limit])
        ])

    return results

# Load the database
cursor = sqlite3.connect("documents")

# Load the LLM
llm = LLM("Qwen/Qwen3-4B-Instruct-2507")

# RAG with a SQL query
rag = RAG(sqlsearch, llm, template=template, output="flatten")
print(rag("Tell me what happened in the 2025 World Series", maxlength=2048))

In the 2025 World Series, the Los Angeles Dodgers defeated the Toronto Blue Jays in seven games to win the championship. The series took place from October 24 to November 1 (ending early on November 2, Toronto time). Dodgers pitcher Yoshinobu Yamamoto was named the World Series MVP. The series was televised by Fox in the United States and by Sportsnet in Canada.
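
To experiment with the LIKE-based lookup without downloading the database or loading an LLM, the sketch below builds a small in-memory SQLite table with the same `id` and `text` columns queried above. The table contents and the `likesearch` helper are hypothetical, and the keyword is passed in directly instead of being extracted by an LLM. The sketch also escapes `%` and `_` in the keyword, an optional hardening step that is not part of the original example.

```python
import sqlite3

# In-memory stand-in for the sections table (contents are made up)
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sections (id TEXT, text TEXT)")
connection.executemany(
    "INSERT INTO sections VALUES (?, ?)",
    [
        ("2025 World Series", "Placeholder text about a baseball championship."),
        ("World Wide Web", "Placeholder text about the web."),
        ("100% Pure", "Placeholder text with a wildcard character in the id."),
    ],
)

def likesearch(keyword, limit):
    """Match rows whose id contains the keyword, treating % and _ literally."""
    escaped = (
        keyword.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    sql = "SELECT id, text FROM sections WHERE id LIKE ? ESCAPE '\\' LIMIT ?"
    return [
        {"id": uid, "text": text, "score": 1.0}
        for uid, text in connection.execute(sql, [f"%{escaped}%", limit])
    ]

# SQLite's LIKE is case-insensitive for ASCII characters by default
print([row["id"] for row in likesearch("world series", 5)])
print([row["id"] for row in likesearch("100%", 5)])
```

Binding the pattern as a parameter, as the original `sqlsearch` also does, keeps LLM-generated keywords out of the SQL string itself.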

Wrapping up

This article showed that RAG is about much more than vector search. With txtai 9.3+, any callable method can now be used for retrieval. Enjoy!
