Build Your First Agentic AI: A Complete Guide to MCP + Ollama Tool Calling
Source: Dev.to
Introduction
Learn how to build an AI agent that can use tools, make decisions, and take actions, all running on your local machine.
Background
- Ollama – a local AI model runtime (think Docker, but for AI models).
- Tool calling – lets a large language model recognize when a function is needed, pick the right tool, pass the correct arguments, and interpret the result.
- MCP (Model Context Protocol) – a standardized way to connect LLMs to tools and data sources (the USB-C of AI). FastMCP is a Python library that simplifies building MCP servers.
Architecture
┌─────────────────────────────────────────────────────────────┐
│ YOUR COMPUTER │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Ollama │◄──►│ Python │◄──►│ FastMCP │ │
│ │ (AI Brain) │ │ Client │ │ Server │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Flow
- You ask a question.
- The Python client sends the question, along with the list of available tools, to Ollama.
- Ollama decides whether a tool is needed (see the example response shape below).
- The client calls the tool through FastMCP.
- The result is returned to Ollama.
- Ollama generates the final answer.
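To make the decision step concrete: when the model decides a tool is needed, the response returned by the ollama Python package carries a tool_calls field instead of a final answer. The shape is roughly the following (values are illustrative, not output from the tutorial code):

# Rough shape of a tool-call response from ollama.chat(...)
response = {
    "message": {
        "role": "assistant",
        "content": "",  # no direct answer yet
        "tool_calls": [
            {"function": {"name": "add", "arguments": {"a": 150, "b": 75}}}
        ],
    }
}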
Ollama Setup
Linux
curl -fsSL https://ollama.ai/install.sh | sh
macOS
brew install ollama
Windows
Download the installer from the Ollama website, install it, then run ollama --version.
You should see something like ollama version 0.1.x. Then start the server:
ollama serve
Keep this terminal open.
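Once the requests package from the Project Setup section below is installed, a quick check from Python can confirm the daemon is reachable; this sketch assumes Ollama's default port of 11434:

# quick check - Ollama listens on http://localhost:11434 by default
import requests

resp = requests.get("http://localhost:11434")
print(resp.status_code, resp.text)  # expect 200 and a short status message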
Pull a model that supports tool calling
ollama pull llama3.2 # ~2 GB download
ollama run llama3.2
Test it:
>>> Hello! Who are you?
I'm Llama 3.2, an AI assistant...
>>> /bye
Model Support Table
| Model | Size | Speed | Best For |
|---|---|---|---|
| llama3.2 | 3 GB | Fast | Recommended for this tutorial |
| llama3.1 | 5 GB | Medium | More accurate responses |
| mistral | 4 GB | Fast | Good general purpose |
| qwen2.5 | 4 GB | Fast | Multilingual support |
Models without tool support
| Model | Why Not? |
|---|---|
| codellama | Built only for code generation |
| llama2 | Older architecture, no tool support |
| phi | Too small for complex tool reasoning |
Attempting to use them yields: Error: does not support tools (status code: 400).
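If you would rather check before committing to a model, a small probe can attempt a tool-enabled chat and treat that 400 error as "no tool support". This is a sketch assuming the ollama Python package (installed in Project Setup below) and a running ollama serve:

# check_tool_support.py - probe whether a local model accepts the `tools` parameter
import ollama

NOOP_TOOL = {
    "type": "function",
    "function": {
        "name": "noop",
        "description": "Placeholder tool used only to test tool support",
        "parameters": {"type": "object", "properties": {}},
    },
}

def supports_tools(model: str) -> bool:
    try:
        ollama.chat(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
            tools=[NOOP_TOOL],
        )
        return True
    except ollama.ResponseError:
        # Models without tool support return "does not support tools" (status 400)
        return False

print("llama3.2 supports tools:", supports_tools("llama3.2"))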
Project Setup
mkdir mcp-ollama-tutorial
cd mcp-ollama-tutorial
# Virtual environment
python -m venv myenv
source myenv/bin/activate # Linux/macOS
# myenv\Scripts\activate # Windows
pip install fastmcp ollama requests
python -c "import fastmcp, ollama, requests; print('✅ All packages installed!')"
Building the MCP Server
Create mcp_server.py:
# mcp_server.py
from fastmcp import FastMCP
mcp = FastMCP("My First MCP Server")
@mcp.tool()
def add(a: int, b: int) -> int:
"""Add two numbers together"""
return a + b
@mcp.tool()
def greet(name: str) -> str:
"""Greet someone by name"""
return f"Hello, {name}! Welcome!"
@mcp.tool()
def multiply(a: float, b: float) -> float:
"""Multiply two numbers"""
return a * b
@mcp.tool()
def get_time() -> str:
"""Get the current time"""
from datetime import datetime
return datetime.now().strftime("%I:%M %p")
if __name__ == "__main__":
mcp.run(transport="sse", port=8080)
Run the server:
python mcp_server.py
You should see:
INFO: Started server process
INFO: Uvicorn running on http://127.0.0.1:8080
✅ Checkpoint: Your tool server is running!
Keep this terminal open.
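Before wiring in Ollama, you can sanity-check the server directly with FastMCP's own client. This is a minimal sketch; it assumes the SSE endpoint lives at /sse on port 8080, FastMCP's default path for the transport configured above:

# test_server.py - call the MCP tools directly, no LLM involved
import asyncio
from fastmcp import Client

async def main():
    async with Client("http://127.0.0.1:8080/sse") as mcp:
        tools = await mcp.list_tools()
        print("Tools:", [t.name for t in tools])
        result = await mcp.call_tool("add", {"a": 2, "b": 3})
        print("add(2, 3) ->", result)

asyncio.run(main())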
Building the Client
Create client_ollama.py:
# client_ollama.py
import asyncio

import ollama
from fastmcp import Client as MCPClient

OLLAMA_MODEL = "llama3.2"
# The FastMCP server above uses the SSE transport, whose endpoint defaults to /sse
MCP_SERVER_URL = "http://127.0.0.1:8080/sse"

# ------------------------------------------------------------
# Step 1: Discover available tools from MCP server
# ------------------------------------------------------------
async def load_mcp_tools():
    """Connect to the MCP server and convert its tool list to Ollama's format."""
    async with MCPClient(MCP_SERVER_URL) as mcp:
        mcp_tools = await mcp.list_tools()
    # Ollama expects each tool as a JSON-schema "function" description
    return [
        {
            "type": "function",
            "function": {
                "name": t.name,
                "description": t.description or "",
                "parameters": t.inputSchema,
            },
        }
        for t in mcp_tools
    ]

# ------------------------------------------------------------
# Step 2: Send a user query to Ollama, letting it decide on tools
# ------------------------------------------------------------
async def ask_ollama(messages, tools):
    """Send the conversation plus the tool schema to Ollama and get a response."""
    return await ollama.AsyncClient().chat(model=OLLAMA_MODEL, messages=messages, tools=tools)

# ------------------------------------------------------------
# Step 3: Execute any required tool calls and feed results back
# ------------------------------------------------------------
async def run():
    tools = await load_mcp_tools()
    user_prompt = input("You: ")
    messages = [{"role": "user", "content": user_prompt}]
    response = await ask_ollama(messages, tools)

    # If Ollama requested tool calls, execute them against the MCP server
    if response.message.tool_calls:
        messages.append(response.message)  # keep the assistant turn in the history
        async with MCPClient(MCP_SERVER_URL) as mcp:
            for call in response.message.tool_calls:
                tool_name = call.function.name
                args = dict(call.function.arguments or {})
                result = await mcp.call_tool(tool_name, args)
                # Feed each tool result back to the model as a "tool" message
                messages.append({"role": "tool", "content": str(result)})
        # Let Ollama compose the final answer from the tool results
        response = await ask_ollama(messages, tools)

    print("\nAI:", response.message.content)

if __name__ == "__main__":
    asyncio.run(run())
Run the client in a new terminal:
python client_ollama.py
You can now interact with the agent:
You: Hey, greet Alice and then calculate 150 + 75
AI: *Thinking... I need to use the greet tool and the add tool*
AI: Hello, Alice! Welcome! The sum of 150 and 75 is 225.
Running the System
- Terminal 1 – start Ollama: ollama serve
- Terminal 2 – run the MCP server: python mcp_server.py
- Terminal 3 – run the client: python client_ollama.py
How It Works (Deep Dive)
- Tool discovery – the client requests the list of tools (names, descriptions, and parameter schemas) from the MCP server via MCP's tools/list call.
- Prompt augmentation – the schema is sent to Ollama so the model knows which functions are available (an example is shown after this list).
- Decision – Ollama decides whether a tool call is needed based on the user query.
- Execution – the client invokes the appropriate function via FastMCP (HTTP/SSE).
- Result integration – the tool’s output is fed back to Ollama, which composes the final answer.
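As a concrete example of prompt augmentation, the add tool from mcp_server.py reaches Ollama as a JSON-schema function description roughly like the one below; the exact schema depends on how FastMCP derives it from the type hints and docstring:

# Approximate description of the `add` tool as presented to Ollama
add_tool = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers together",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "integer"},
                "b": {"type": "integer"},
            },
            "required": ["a", "b"],
        },
    },
}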
Customization
Add new tools by defining additional @mcp.tool() functions in mcp_server.py. Example:
@mcp.tool()
def get_weather(city: str) -> str:
"""Return a simple weather description for the given city."""
# Placeholder implementation
return f"The weather in {city} is sunny with 22°C."
Restart the server and the client will automatically discover the new tool.
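If you want the tool to return live data instead of a placeholder, one option is to call an external service with the requests package installed earlier. The sketch below uses wttr.in purely as an example endpoint; its output format is not guaranteed, so treat this as a starting point:

# live_weather_tool.py - sketch of a network-backed tool (wttr.in is only an example endpoint)
import requests
from fastmcp import FastMCP

mcp = FastMCP("Weather Demo")  # or add the function below to the server in mcp_server.py

@mcp.tool()
def get_weather(city: str) -> str:
    """Return a one-line weather summary for the given city."""
    resp = requests.get(f"https://wttr.in/{city}", params={"format": "3"}, timeout=10)
    resp.raise_for_status()
    return resp.text.strip()

if __name__ == "__main__":
    mcp.run(transport="sse", port=8081)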
Troubleshooting
| Problem | Solution |
|---|---|
| Ollama cannot find the model | Verify the model name (ollama list) and that ollama serve is running. |
| Tool not recognized | Ensure the MCP server is running and reachable at MCP_SERVER_URL. |
| Port conflict (8080) | Change the port in mcp_server.py and update MCP_SERVER_URL accordingly. |
| Python virtual environment not activated | Activate it (source myenv/bin/activate or myenv\Scripts\activate). |
| Model returns incorrect calculation | Confirm the model you pulled supports tool calling (e.g., llama3.2). |
Real Project Ideas
Email Assistant
- Goal: Manage emails (list, read, draft, send) via natural language.
- Tools: list_emails(), read_email(id), draft_reply(id, content), send_email(id).
- Tech Stack: FastMCP, imaplib/smtplib, SQLite for local cache.
Personal Knowledge Base
- Goal: Smart note‑taking with tagging, search, and summarization.
- Tools: add_note(title, body), search_notes(query), summarize_note(id) (the first two are sketched below).
- Tech Stack: FastMCP, sqlite3, optional embedding model (e.g., sentence-transformers).
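A minimal sketch of the first two knowledge-base tools, assuming a plain sqlite3 file and simple LIKE-based search; the file name and schema here are illustrative, not part of the tutorial code:

# notes_server.py - illustrative MCP tools for a personal knowledge base
import sqlite3
from fastmcp import FastMCP

mcp = FastMCP("Knowledge Base")
# check_same_thread=False lets the server's worker threads share this connection
db = sqlite3.connect("notes.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")

@mcp.tool()
def add_note(title: str, body: str) -> str:
    """Store a note and return its id."""
    cur = db.execute("INSERT INTO notes (title, body) VALUES (?, ?)", (title, body))
    db.commit()
    return f"Saved note #{cur.lastrowid}"

@mcp.tool()
def search_notes(query: str) -> list[str]:
    """Return titles of notes whose title or body contains the query."""
    rows = db.execute(
        "SELECT id, title FROM notes WHERE title LIKE ? OR body LIKE ?",
        (f"%{query}%", f"%{query}%"),
    ).fetchall()
    return [f"#{note_id}: {title}" for note_id, title in rows]

if __name__ == "__main__":
    mcp.run(transport="sse", port=8082)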
Finance Manager
- Goal: Track expenses, generate budgets, and answer financial queries.
- Tools: add_transaction(date, amount, category), monthly_report(month).
- Tech Stack: FastMCP, pandas, matplotlib for visual reports.
Smart Home Controller
- Goal: Control IoT devices (lights, thermostat, locks) via voice/text.
- Tools: set_light(room, state), set_temperature(value), lock_door(door).
- Tech Stack: FastMCP, MQTT or local REST APIs of smart devices.
Data Analysis Assistant
- Goal: Load CSV/Excel files, run analyses, and produce charts.
- Tools: load_dataset(path), describe_data(), plot(column_x, column_y).
- Tech Stack: FastMCP, pandas, seaborn/matplotlib.
Study Assistant
- Goal: Generate flashcards, quizzes, and spaced‑repetition schedules.
- Tools: create_flashcards(topic), quiz_user(topic), schedule_review(topic).
- Tech Stack: FastMCP, sqlite3, optional LLM for content generation.
Each project idea comes with suggested tools and a recommended tech stack, and follows the same server/client pattern shown above. By following this guide you'll have a functional, locally run AI agent capable of understanding natural-language requests, deciding which tools to invoke, executing them, and returning intelligent responses, all without external API keys or cloud costs.