Using the Ollama Web Search API in Python

Published: December 5, 2025, 2:41 PM (GMT+9)


Source: Dev.to

Getting Started

Installation
Install version 0.6.0 or later:

pip install 'ollama>=0.6.0'

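Environment Setup
To manage your Python environment and packages, consider a virtual environment created with uv or venv.

Create an API key in your Ollama account and set it as an environment variable: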

export OLLAMA_API_KEY="your_api_key"

In Windows PowerShell:

$env:OLLAMA_API_KEY = "your_api_key"
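
Since the search calls rely on this key, it can help to check for it before doing any work. The following is a minimal sketch: `require_api_key` is a hypothetical helper, not part of the ollama package, and it assumes only the `OLLAMA_API_KEY` variable name used above.

```python
import os


def require_api_key(env=None):
    """Return the Ollama API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the ollama package.
    """
    env = os.environ if env is None else env
    key = env.get("OLLAMA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OLLAMA_API_KEY is not set; create a key in your Ollama account first"
        )
    return key
```

Calling `require_api_key()` at the top of a script turns a missing key into an immediate, readable error instead of a failure deep inside a later request.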

Basic Web Search

import ollama

# Simple web search
response = ollama.web_search("What is Ollama?")
print(response)

Sample output

results = [
    {
        "title": "Ollama",
        "url": "https://ollama.com/",
        "content": "Cloud models are now available in Ollama..."
    },
    {
        "title": "What is Ollama? Features, Pricing, and Use Cases",
        "url": "https://www.walturn.com/insights/what-is-ollama",
        "content": "Our services..."
    },
    {
        "title": "Complete Ollama Guide: Installation, Usage & Code Examples",
        "url": "https://collabnix.com/complete-ollama-guide",
        "content": "Join our Discord Server..."
    }
]

Controlling the Number of Results

import ollama

# Get more results
response = ollama.web_search("latest AI news", max_results=10)

for result in response.results:
    print(f"📌 {result.title}")
    print(f"   {result.url}")
    print(f"   {result.content[:100]}...")
    print()
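
When the results go into a log or a prompt rather than straight to the terminal, building one string is often more convenient than printing line by line. The sketch below assumes only that each result exposes `title`, `url`, and `content` attributes, as in the loop above. `format_results` is a hypothetical helper, and the sample data is a stand-in so the snippet runs without network access.

```python
from types import SimpleNamespace


def format_results(results, snippet_len=100):
    """Render search results as a numbered plain-text list."""
    lines = []
    for i, r in enumerate(results, start=1):
        snippet = " ".join(r.content.split())  # collapse whitespace and newlines
        if len(snippet) > snippet_len:
            snippet = snippet[:snippet_len].rstrip() + "..."
        lines.append(f"{i}. {r.title}\n   {r.url}\n   {snippet}")
    return "\n\n".join(lines)


# Stand-in data shaped like the sample output above (title, url, content)
sample = [
    SimpleNamespace(
        title="Ollama",
        url="https://ollama.com/",
        content="Cloud models are now available in Ollama...",
    ),
]
print(format_results(sample))
```

With a real response, the call would be `format_results(response.results)`.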

Fetching Full Page Content

web_search returns multiple search results, each with a title, URL, and snippet.
web_fetch retrieves the full content of a specific URL and returns the page title, Markdown content, and links.

from ollama import web_fetch

result = web_fetch('https://ollama.com')
print(result)

Sample output

WebFetchResponse(
    title='Ollama',
    content='[Cloud models](https://ollama.com/blog/cloud-models) are now available in Ollama\n\n**Chat & build with open models**\n\n[Download](https://ollama.com/download) [Explore models](https://ollama.com/models)\n\nAvailable for macOS, Windows, and Linux',
    links=['https://ollama.com/', 'https://ollama.com/models', 'https://github.com/ollama/ollama']
)
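
The `links` list can mix pages on the fetched site with external ones. As a sketch of how a crawler-style follow-up might narrow it down, the hypothetical helper below keeps only links on the same host as the fetched URL, using the standard library's `urllib.parse`. The link list is copied from the sample output above.

```python
from urllib.parse import urlparse


def same_site_links(links, base_url):
    """Keep links whose host matches base_url's host, without duplicates."""
    base_host = urlparse(base_url).netloc.lower()
    seen = set()
    kept = []
    for link in links:
        if urlparse(link).netloc.lower() == base_host and link not in seen:
            seen.add(link)
            kept.append(link)
    return kept


links = [
    'https://ollama.com/',
    'https://ollama.com/models',
    'https://github.com/ollama/ollama',
]
print(same_site_links(links, 'https://ollama.com'))
# prints ['https://ollama.com/', 'https://ollama.com/models']
```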

For converting HTML to Markdown yourself, see the guide on converting HTML to Markdown with Python.

Combining Search and Fetch

A common pattern is to search first, then fetch the full content of the relevant results:

from ollama import web_search, web_fetch

# Search for information
search_results = web_search("Ollama new features 2025")

# Fetch the full content of the first result
if search_results.results:
    first_url = search_results.results[0].url
    full_content = web_fetch(first_url)

    print(f"Title: {full_content.title}")
    print(f"Content: {full_content.content[:500]}...")
    print(f"Links found: {len(full_content.links)}")
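
The same pattern extends to several results. The sketch below is one possible shape, not the library's own API: `search_and_fetch` is a hypothetical helper that takes the search and fetch functions as parameters, so `web_search` and `web_fetch` can be passed in directly, and a page that fails to load is skipped rather than aborting the run.

```python
def search_and_fetch(query, search, fetch, top_n=3):
    """Search, then fetch up to top_n result pages, skipping pages that fail.

    `search` and `fetch` are passed in so ollama's web_search and web_fetch
    (or test doubles) can be used interchangeably.
    """
    pages = []
    for result in search(query).results[:top_n]:
        try:
            pages.append(fetch(result.url))
        except Exception as exc:  # one bad URL should not abort the rest
            print(f"Skipping {result.url}: {exc}")
    return pages
```

With the package installed and the API key set, a call would look like `search_and_fetch("Ollama new features 2025", web_search, web_fetch)`.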

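Building a Search Agent

Models with strong tool-use capabilities work best, for example qwen3, gpt-oss, and cloud models such as qwen3:480b-cloud and deepseek-v3.1-cloud. For advanced use cases, see the guide on structured output from LLMs with Ollama and Qwen3.

First, pull a capable model: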

ollama pull qwen3:4b

Simple Search Agent

A basic search agent that decides for itself when a search is needed:

from ollama import chat, web_fetch, web_search

available_tools = {'web_search': web_search, 'web_fetch': web_fetch}
messages = [{'role': 'user', 'content': "what is ollama's new engine"}]

while True:
    response = chat(
        model='qwen3:4b',
        messages=messages,
        tools=[web_search, web_fetch],
        think=True
    )

    if response.message.thinking:
        print('🧠 Thinking:', response.message.thinking[:200], '...')

    if response.message.content:
        print('💬 Response:', response.message.content)

    messages.append(response.message)

    if response.message.tool_calls:
        print('🔧 Tool calls:', response.message.tool_calls)
        for tool_call in response.message.tool_calls:
            function_to_call = available_tools.get(tool_call.function.name)
            if function_to_call:
                args = tool_call.function.arguments
                result = function_to_call(**args)
                print('📥 Result:', str(result)[:200], '...')
                # Truncate the result to fit the limited context length
                messages.append({
                    'role': 'tool',
                    'content': str(result)[:2000 * 4],
                    'tool_name': tool_call.function.name
                })
            else:
                messages.append({
                    'role': 'tool',
                    'content': f'Tool {tool_call.function.name} not found',
                    'tool_name': tool_call.function.name
                })
    else:
        break
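
In the loop above, an exception raised inside a tool (a network failure, or arguments the function does not accept) would end the whole conversation. One way to soften that, sketched here under the assumption that the model can recover from an error message in the tool reply, is a small wrapper that always returns a string. `run_tool` is a hypothetical helper, not part of the ollama package.

```python
def run_tool(tools, name, arguments):
    """Call a tool by name and always return a string for the 'tool' message."""
    func = tools.get(name)
    if func is None:
        return f"Tool {name} not found"
    try:
        return str(func(**arguments))
    except Exception as exc:  # e.g. network errors or unexpected arguments
        return f"Tool {name} failed: {exc}"
```

Inside the loop, `run_tool(available_tools, tool_call.function.name, tool_call.function.arguments)` would replace the direct call and the separate "not found" branch.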

Handling Large Web Search Results
Before passing results to the model, truncate them to fit the context limit (≈ 8000 characters, i.e. 2000 tokens × 4 characters per token).
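
The arithmetic above can be wrapped in a small function so the budget is stated once. This is a sketch: `truncate_for_context` and its `marker` parameter are hypothetical, and the 4 characters per token figure is the rough estimate used in this article, not an exact tokenizer count.

```python
def truncate_for_context(text, max_tokens=2000, chars_per_token=4,
                         marker=" [truncated]"):
    """Cut text to an approximate token budget using a chars-per-token estimate."""
    limit = max_tokens * chars_per_token  # 2000 * 4 = 8000 characters by default
    if len(text) <= limit:
        return text
    # Reserve room for the marker so the result stays within the limit
    keep = max(limit - len(marker), 0)
    return text[:keep] + marker


long_text = "x" * 10_000
print(len(truncate_for_context(long_text)))  # prints 8000
```

The marker tells the model that the tool output was cut, which a silent slice such as `str(result)[:2000 * 4]` does not.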

Advanced Search Agent with Error Handling

An improved version with stronger error handling:

from ollama import chat, web_fetch, web_search
import json

class SearchAgent:
    def __init__(self, model: str = 'qwen3:4b'):
        self.model = model
        self.tools = {'web_search': web_search, 'web_fetch': web_fetch}
        self.messages = []
        self.max_iterations = 10

    def query(self, question: str) -> str:
        self.messages = [{'role': 'user', 'content': question}]

        for iteration in range(self.max_iterations):
            try:
                response = chat(
                    model=self.model,
                    messages=self.messages,
                    tools=[web_search, web_fetch],
                    think=True
                )
            except Exception as e:
                return f"Error during chat: {e}"

            self.messages.append(response.message)

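            # No tool calls means this is the final answer
            if not response.message.tool_calls:
                return response.message.content or ''

            # Assumed continuation, mirroring the simple agent above:
            # run each requested tool and feed the result back to the model
            for tool_call in response.message.tool_calls:
                name = tool_call.function.name
                tool = self.tools.get(name)
                try:
                    result = str(tool(**tool_call.function.arguments)) if tool else f'Tool {name} not found'
                except Exception as e:
                    result = f'Tool {name} failed: {e}'
                # Truncate to roughly 2000 tokens (about 8000 characters)
                self.messages.append({'role': 'tool', 'content': result[:2000 * 4], 'tool_name': name})

        return 'Stopped after reaching the maximum number of iterations'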
