Build Your First Agentic AI: A Complete Guide to MCP + Ollama Tool Calling

Published: December 12, 2025, 06:54 PM GMT+9

9 min read

Source: Dev.to

Introduction

Learn how to build an AI agent that can use tools, make decisions, and take actions, all running on your local machine.

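Background

Ollama – A local runtime for AI models (similar to Docker, but for AI).
Tool calling – Lets the LLM recognize when it needs a function, pick the appropriate tool, pass the right parameters, and interpret the result.
MCP (Model Context Protocol) – A standardized way for LLMs to connect to tools and data sources (think of it as USB‑C for AI). FastMCP is a Python library that simplifies building MCP servers.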

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    YOUR COMPUTER                            │
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐ │
│  │   Ollama     │◄──►│   Python     │◄──►│   FastMCP    │ │
│  │  (AI Brain)  │    │   Client     │    │   Server     │ │
│  └──────────────┘    └──────────────┘    └──────────────┘ │
│                                                             │
└─────────────────────────────────────────────────────────────┘

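Flow

You type a question.
The Python client sends it to Ollama along with the list of available tools.
Ollama decides whether a tool is needed.
If so, the client calls the tool through FastMCP.
The result is passed back to Ollama.
Ollama generates the final answer.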
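
The flow above can be sketched without Ollama or FastMCP at all. In this minimal illustration, fake_model and the TOOLS dictionary are hypothetical stand-ins (they are not part of either library): the fake model always asks for the add tool once and then answers, whereas a real model makes that decision from the prompt and the tool list.

```python
import json

# Hypothetical in-process tool registry standing in for the FastMCP server
TOOLS = {
    "add": lambda a, b: a + b,
    "greet": lambda name: f"Hello, {name}! Welcome!",
}

def fake_model(messages):
    """Stand-in for Ollama: request the add tool once, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {
            "role": "assistant",
            "content": "",
            "tool_calls": [{"name": "add", "arguments": {"a": 150, "b": 75}}],
        }
    return {"role": "assistant",
            "content": f"The result is {tool_results[-1]['content']}."}

def run_agent(question, model=fake_model, max_rounds=5):
    messages = [{"role": "user", "content": question}]     # question goes in
    for _ in range(max_rounds):
        reply = model(messages)                             # model decides
        messages.append(reply)
        calls = reply.get("tool_calls")
        if not calls:
            return reply["content"]                         # final answer
        for call in calls:
            result = TOOLS[call["name"]](**call["arguments"])  # run the tool
            # Feed the tool result back into the conversation
            messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("model kept requesting tools")

answer = run_agent("What is 150 + 75?")
print(answer)
```

The max_rounds cap is a design choice of this sketch: it keeps a misbehaving model from requesting tools forever.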

Setting Up Ollama

Linux

curl -fsSL https://ollama.ai/install.sh | sh

macOS

brew install ollama

Windows

Download the installer from the Ollama website and run it. To verify the installation, run ollama --version.
You should see something like ollama version 0.1.x. Then start the server:

ollama serve

Keep this terminal open.

Download a Model That Supports Tool Calling

ollama pull llama3.2   # ~2 GB download
ollama run llama3.2

Test it:

>>> Hello! Who are you?
I'm Llama 3.2, an AI assistant...

>>> /bye

Models That Support Tools

Model	Size	Speed	Best For
llama3.2	3 GB	Fast	Recommended for this tutorial
llama3.1	5 GB	Medium	More accurate responses
mistral	4 GB	Fast	Good general purpose
qwen2.5	4 GB	Fast	Multilingual support

Models That Do Not Support Tools

Model	Why Not?
codellama	Built only for code generation
llama2	Older architecture, no tool support
phi	Too small for complex tool reasoning

If you try to use one of these models, Ollama returns the error: Error: does not support tools (status code: 400).
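
To fail early instead of hitting that 400 error mid-conversation, a client could check the model name before sending any tools. The sketch below only encodes the two tables above; check_tool_support is a hypothetical helper, not an Ollama API, and models missing from the tables are reported as unknown rather than guessed.

```python
from typing import Optional

# Model families copied from the two tables above; extend as needed
TOOL_CAPABLE = {"llama3.2", "llama3.1", "mistral", "qwen2.5"}
NOT_TOOL_CAPABLE = {"codellama", "llama2", "phi"}

def check_tool_support(model: str) -> Optional[bool]:
    """Return True/False for models listed in the tables, None if unknown."""
    # Strip an optional tag such as ":latest" before the lookup
    family = model.split(":", 1)[0].lower()
    if family in TOOL_CAPABLE:
        return True
    if family in NOT_TOOL_CAPABLE:
        return False
    return None

for name in ("llama3.2:latest", "llama2", "gemma"):
    print(name, "->", check_tool_support(name))
```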

Project Setup

mkdir mcp-ollama-tutorial
cd mcp-ollama-tutorial

# Virtual environment
python -m venv myenv
source myenv/bin/activate   # Linux/macOS
# myenv\Scripts\activate      # Windows

pip install fastmcp ollama requests
python -c "import fastmcp, ollama, requests; print('✅ All packages installed!')"

Building the MCP Server

Create a file named mcp_server.py:

# mcp_server.py
from fastmcp import FastMCP

mcp = FastMCP("My First MCP Server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together"""
    return a + b

@mcp.tool()
def greet(name: str) -> str:
    """Greet someone by name"""
    return f"Hello, {name}! Welcome!"

@mcp.tool()
def multiply(a: float, b: float) -> float:
    """Multiply two numbers"""
    return a * b

@mcp.tool()
def get_time() -> str:
    """Get the current time"""
    from datetime import datetime
    return datetime.now().strftime("%I:%M %p")

if __name__ == "__main__":
    mcp.run(transport="sse", port=8080)

Run the server:

python mcp_server.py

You should see messages like these:

INFO:     Started server process
INFO:     Uvicorn running on http://127.0.0.1:8080
✅ Checkpoint: Your tool server is running!

Keep this terminal open.

Building the Client

Create a file named client_ollama.py:

# client_ollama.py
import asyncio

import ollama
from fastmcp import Client as MCPClient

OLLAMA_MODEL = "llama3.2"
# mcp_server.py uses the SSE transport, which FastMCP serves at /sse by default
MCP_SERVER_URL = "http://127.0.0.1:8080/sse"

# ------------------------------------------------------------
# Step 1: Discover available tools from MCP server
# ------------------------------------------------------------
async def load_mcp_tools(mcp):
    """List the MCP server's tools and convert them to Ollama's tool format"""
    tools = await mcp.list_tools()
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description or "",
                "parameters": tool.inputSchema,
            },
        }
        for tool in tools
    ]

# ------------------------------------------------------------
# Step 2: Send the conversation to Ollama, letting it decide on tools
# ------------------------------------------------------------
async def ask_ollama(messages, tools):
    """Send messages + tool schemas to Ollama and return the assistant message"""
    response = await ollama.AsyncClient().chat(
        model=OLLAMA_MODEL,
        messages=messages,
        tools=tools,
    )
    return response["message"]

# ------------------------------------------------------------
# Step 3: Execute any required tool calls and feed results back
# ------------------------------------------------------------
async def run():
    async with MCPClient(MCP_SERVER_URL) as mcp:
        tools = await load_mcp_tools(mcp)
        messages = [{"role": "user", "content": input("You: ")}]
        message = await ask_ollama(messages, tools)

        # Keep looping while Ollama requests tool calls
        while message.get("tool_calls"):
            messages.append(message)  # keep the assistant's request in the history
            for call in message["tool_calls"]:
                tool_name = call["function"]["name"]
                args = call["function"]["arguments"]
                result = await mcp.call_tool(tool_name, args)
                # Newer FastMCP versions wrap the content blocks in a result
                # object; older ones return the list of blocks directly
                blocks = getattr(result, "content", result)
                text = "\n".join(getattr(b, "text", str(b)) for b in blocks)
                messages.append({"role": "tool", "content": text})
            # Send the tool results back to Ollama
            message = await ask_ollama(messages, tools)

    print("\nAI:", message["content"])

if __name__ == "__main__":
    asyncio.run(run())

Run the client in a new terminal:

python client_ollama.py

You can now talk to the agent. An illustrative session (the exact wording depends on the model):

You: Hey, greet Alice and then calculate 150 + 75
AI: *Thinking... I need to use the greet tool and the add tool*
AI: Hello, Alice! Welcome! The sum of 150 and 75 is 225.

Running the System

Terminal 1 – Start Ollama: ollama serve
Terminal 2 – Run the MCP server: python mcp_server.py
Terminal 3 – Run the client: python client_ollama.py

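How It Works (In Depth)

Tool discovery – The client fetches the JSON schema of every tool from the MCP server (an MCP tools/list request).
Prompt augmentation – The schemas are passed to Ollama so the model knows which functions are available.
Decision – Ollama decides, based on the user's query, whether a tool call is needed.
Execution – The client runs the requested function through FastMCP (HTTP/SSE).
Result integration – The tool output is passed back to Ollama, which composes the final answer.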
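
The discovery and prompt-augmentation steps hinge on a small format conversion: an MCP tool definition carries name, description, and inputSchema, while Ollama expects each tool wrapped as a function entry. A minimal sketch on plain dictionaries follows; mcp_tool_to_ollama is a hypothetical helper name, and the sample schema for add is written by hand for illustration rather than captured from a running server.

```python
import json

def mcp_tool_to_ollama(tool: dict) -> dict:
    """Wrap one MCP tool definition in the function format Ollama accepts."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Fall back to an empty object schema for tools without parameters
            "parameters": tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }

# Hand-written example of what the server's add tool could look like
mcp_add = {
    "name": "add",
    "description": "Add two numbers together",
    "inputSchema": {
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
}

ollama_add = mcp_tool_to_ollama(mcp_add)
print(json.dumps(ollama_add, indent=2))
```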

Customization

Add more @mcp.tool() functions to mcp_server.py to create new tools. For example:

@mcp.tool()
def get_weather(city: str) -> str:
    """Return a simple weather description for the given city."""
    # Placeholder implementation
    return f"The weather in {city} is sunny with 22°C."

Restart the server, and the client discovers the new tool automatically the next time it connects.
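
Type hints and docstrings matter when you add a tool, because the model only ever sees the tool's name, description, and parameter schema. The sketch below shows roughly how such a schema can be derived from a function signature. It is a simplified illustration, not FastMCP's actual implementation; describe_tool and the extra units parameter are assumptions made for this example.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python types to JSON Schema type names
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}

def describe_tool(fn):
    """Build an MCP-style tool description from a function's signature."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters belong in the input schema
    params = inspect.signature(fn).parameters
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "inputSchema": {
            "type": "object",
            "properties": {
                name: {"type": JSON_TYPES.get(tp, "string")}
                for name, tp in hints.items()
            },
            # Parameters without a default value are required
            "required": [
                name for name, p in params.items()
                if p.default is inspect.Parameter.empty
            ],
        },
    }

def get_weather(city: str, units: str = "metric") -> str:
    """Return a simple weather description for the given city."""
    return f"The weather in {city} is sunny with 22°C."

schema = describe_tool(get_weather)
print(schema)
```

A vague docstring or a missing type hint therefore translates directly into a vaguer schema for the model to reason about.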

Troubleshooting

Problem	Solution
Ollama cannot find the model	Verify the model name (`ollama list`) and that `ollama serve` is running.
Tool not recognized	Ensure the MCP server is running and reachable at `MCP_SERVER_URL`.
Port conflict (8080)	Change the port in `mcp_server.py` and update `MCP_SERVER_URL` accordingly.
Python virtual environment not activated	Activate it (`source myenv/bin/activate` or `myenv\Scripts\activate`).
Model returns incorrect calculation	Confirm the model you pulled supports tool calling (e.g., `llama3.2`).
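
For the port-conflict and unreachable-server rows, a quick check from Python can tell you whether anything is listening on the port. This is a minimal sketch using only the standard library; port_in_use is a hypothetical helper, and 8080 is the port assumed from mcp_server.py.

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0

if port_in_use(8080):
    print("Port 8080 is taken: either the MCP server is already running "
          "or another program is using the port.")
else:
    print("Nothing is listening on port 8080.")
```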

Real-World Project Ideas

Email Assistant

Goal: List, read, draft, and send email using natural language.
Tools: list_emails(), read_email(id), draft_reply(id, content), send_email(id).
Tech stack: FastMCP, imaplib/smtplib, SQLite for a local cache.

Personal Knowledge Base

Goal: Smart notes with tagging, search, and summarization.
Tools: add_note(title, body), search_notes(query), summarize_note(id).
Tech stack: FastMCP, sqlite3, an optional embedding model (sentence‑transformers).
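
As a starting point, two of these tools could look like the sketch below. It uses an in-memory SQLite database and plain substring search (no embeddings), and the table layout is an assumption made for illustration. The functions are shown undecorated; in mcp_server.py each would carry @mcp.tool().

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path instead to persist notes
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)

def add_note(title: str, body: str) -> int:
    """Store a note and return its id."""
    cur = conn.execute(
        "INSERT INTO notes (title, body) VALUES (?, ?)", (title, body)
    )
    conn.commit()
    return cur.lastrowid

def search_notes(query: str) -> list:
    """Return notes whose title or body contains the query text."""
    pattern = f"%{query}%"
    rows = conn.execute(
        "SELECT id, title FROM notes "
        "WHERE title LIKE ? OR body LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return [{"id": row[0], "title": row[1]} for row in rows]

first_id = add_note("MCP basics", "Model Context Protocol connects tools")
add_note("Groceries", "milk, eggs")
print(search_notes("protocol"))
```

SQLite's LIKE is case-insensitive for ASCII text, which is why the lowercase query still matches the note body.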

Finance Manager

Goal: Track spending, create budgets, and answer financial questions.
Tools: add_transaction(date, amount, category), monthly_report(month).
Tech stack: FastMCP, pandas, matplotlib for visualization.
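
A first version of these two tools does not need pandas. The sketch below keeps transactions in a plain list and assumes dates are ISO strings (YYYY-MM-DD) and months are given as YYYY-MM; both conventions are assumptions of this example. Swapping in pandas later would mostly change monthly_report.

```python
from collections import defaultdict

transactions = []  # in-memory store; a real tool would persist this

def add_transaction(date: str, amount: float, category: str) -> str:
    """Record one expense; date is an ISO string such as 2025-12-01."""
    transactions.append({"date": date, "amount": amount, "category": category})
    return f"Recorded {amount:.2f} in {category} on {date}"

def monthly_report(month: str) -> dict:
    """Total spending per category for a month given as YYYY-MM."""
    totals = defaultdict(float)
    for item in transactions:
        if item["date"].startswith(month):
            totals[item["category"]] += item["amount"]
    return dict(totals)

add_transaction("2025-12-01", 900.0, "rent")
add_transaction("2025-12-03", 12.5, "food")
add_transaction("2025-12-09", 7.5, "food")
add_transaction("2025-11-28", 40.0, "food")
print(monthly_report("2025-12"))
```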

Smart Home Controller

Goal: Control IoT devices (lights, thermostats, locks) by voice or text.
Tools: set_light(room, state), set_temperature(value), lock_door(door).
Tech stack: FastMCP, MQTT or the smart devices' local REST APIs.

Data Analysis Assistant

Goal: Load CSV/Excel files, run analyses, and generate charts.
Tools: load_dataset(path), describe_data(), plot(column_x, column_y).
Tech stack: FastMCP, pandas, seaborn/matplotlib.

Study Helper

Goal: Create flashcards, run quizzes, and manage a spaced-repetition schedule.
Tools: create_flashcards(topic), quiz_user(topic), schedule_review(topic).
Tech stack: FastMCP, sqlite3, optional content generation through an LLM.
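
Spaced repetition can start as simply as a fixed ladder of intervals. In the sketch below, the interval values and the extra successful_reviews and today parameters are assumptions added for illustration; the tool list above only names schedule_review(topic), so a real version would look up the review count itself.

```python
from datetime import date, timedelta

# Assumed review ladder in days; tune to taste
INTERVALS = [1, 3, 7, 14, 30]

def schedule_review(topic: str, successful_reviews: int = 0, today=None) -> str:
    """Return the next review date for a topic as a readable message."""
    today = today or date.today()
    # After the ladder is exhausted, stay on the longest interval
    step = min(successful_reviews, len(INTERVALS) - 1)
    due = today + timedelta(days=INTERVALS[step])
    return f"Review '{topic}' on {due.isoformat()}"

print(schedule_review("MCP", successful_reviews=0, today=date(2025, 12, 12)))
print(schedule_review("MCP", successful_reviews=2, today=date(2025, 12, 12)))
```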

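Each project idea above lists a goal, a set of tools, and a recommended tech stack. By following this guide, you can build a local AI agent that understands natural-language requests, selects and runs the right tools, and returns intelligent responses, with no external API keys or cloud costs.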