Build Your First Agentic AI: A Complete Guide to MCP + Ollama Tool Calling
Source: Dev.to
Introduction
Learn how to build an AI agent that can use tools, make decisions, and take actions, all running on your local machine.
Background
- Ollama – a local AI model runtime (think Docker, but for AI models).
- Tool calling – lets a large language model recognize when a function is needed, pick the right tool, pass the correct arguments, and interpret the result.
- MCP (Model Context Protocol) – a standardized way to connect LLMs to tools and data sources (the USB-C of AI). FastMCP is a Python library that simplifies building MCP servers.
Architecture
┌─────────────────────────────────────────────────────────────┐
│ YOUR COMPUTER │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Ollama │◄──►│ Python │◄──►│ FastMCP │ │
│ │ (AI Brain) │ │ Client │ │ Server │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Flow
- You ask a question.
- The Python client sends the question, along with the list of available tools, to Ollama.
- Ollama decides whether a tool is needed (see the example response shape below).
- The client calls the tool through FastMCP.
- The result is returned to Ollama.
- Ollama generates the final answer.
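To make the decision step concrete: when the model decides a tool is needed, the response returned by the ollama Python package carries a tool_calls field instead of a final answer. The shape is roughly the following (values are illustrative, not output from the tutorial code):

# Rough shape of a tool-call response from ollama.chat(...)
response = {
    "message": {
        "role": "assistant",
        "content": "",  # no direct answer yet
        "tool_calls": [
            {"function": {"name": "add", "arguments": {"a": 150, "b": 75}}}
        ],
    }
}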
Ollama Setup
Linux
curl -fsSL https://ollama.ai/install.sh | sh
macOS
brew install ollama
Windows
Download the installer from the Ollama website, install it, then run ollama --version.
You should see something like ollama version 0.1.x. Then start the server:
ollama serve
Keep this terminal open.
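Once the requests package from the Project Setup section below is installed, a quick check from Python can confirm the daemon is reachable; this sketch assumes Ollama's default port of 11434:

# quick check - Ollama listens on http://localhost:11434 by default
import requests

resp = requests.get("http://localhost:11434")
print(resp.status_code, resp.text)  # expect 200 and a short status message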
Pull a model that supports tool calling
ollama pull llama3.2 # ~2 GB download
ollama run llama3.2
Test it:
>>> Hello! Who are you?
I'm Llama 3.2, an AI assistant...
>>> /bye
Model Support Table
| Model | Size | Speed | Best For |
|---|---|---|---|
| llama3.2 | 3 GB | Fast | Recommended for this tutorial |
| llama3.1 | 5 GB | Medium | More accurate responses |
| mistral | 4 GB | Fast | Good general purpose |
| qwen2.5 | 4 GB | Fast | Multilingual support |
Models without tool support
| Model | Why Not? |
|---|---|
| codellama | Built only for code generation |
| llama2 | Older architecture, no tool support |
| phi | Too small for complex tool reasoning |
Attempting to use them yields: Error: does not support tools (status code: 400).
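If you would rather check before committing to a model, a small probe can attempt a tool-enabled chat and treat that 400 error as "no tool support". This is a sketch assuming the ollama Python package (installed in Project Setup below) and a running ollama serve:

# check_tool_support.py - probe whether a local model accepts the `tools` parameter
import ollama

NOOP_TOOL = {
    "type": "function",
    "function": {
        "name": "noop",
        "description": "Placeholder tool used only to test tool support",
        "parameters": {"type": "object", "properties": {}},
    },
}

def supports_tools(model: str) -> bool:
    try:
        ollama.chat(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
            tools=[NOOP_TOOL],
        )
        return True
    except ollama.ResponseError:
        # Models without tool support return "does not support tools" (status 400)
        return False

print("llama3.2 supports tools:", supports_tools("llama3.2"))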
Project Setup
mkdir mcp-ollama-tutorial
cd mcp-ollama-tutorial
# Virtual environment
python -m venv myenv
source myenv/bin/activate # Linux/macOS
# myenv\Scripts\activate # Windows
pip install fastmcp ollama requests
python -c "import fastmcp, ollama, requests; print('✅ All packages installed!')"
Building the MCP Server
Create mcp_server.py:
# mcp_server.py
from fastmcp import FastMCP
mcp = FastMCP("My First MCP Server")
@mcp.tool()
def add(a: int, b: int) -> int:
"""Add two numbers together"""
return a + b
@mcp.tool()
def greet(name: str) -> str:
"""Greet someone by name"""
return f"Hello, {name}! Welcome!"
@mcp.tool()
def multiply(a: float, b: float) -> float:
"""Multiply two numbers"""
return a * b
@mcp.tool()
def get_time() -> str:
"""Get the current time"""
from datetime import datetime
return datetime.now().strftime("%I:%M %p")
if __name__ == "__main__":
mcp.run(transport="sse", port=8080)
Run the server:
python mcp_server.py
You should see:
INFO: Started server process
INFO: Uvicorn running on http://127.0.0.1:8080
✅ Checkpoint: Your tool server is running!
Keep this terminal open.
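Before wiring in Ollama, you can sanity-check the server directly with FastMCP's own client. This is a minimal sketch; it assumes the SSE endpoint lives at /sse on port 8080, FastMCP's default path for the transport configured above:

# test_server.py - call the MCP tools directly, no LLM involved
import asyncio
from fastmcp import Client

async def main():
    async with Client("http://127.0.0.1:8080/sse") as mcp:
        tools = await mcp.list_tools()
        print("Tools:", [t.name for t in tools])
        result = await mcp.call_tool("add", {"a": 2, "b": 3})
        print("add(2, 3) ->", result)

asyncio.run(main())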
Building the Client
Create client_ollama.py:
# client_ollama.py
import asyncio

import ollama
from fastmcp import Client as MCPClient

OLLAMA_MODEL = "llama3.2"
# The FastMCP server above uses the SSE transport, whose endpoint defaults to /sse
MCP_SERVER_URL = "http://127.0.0.1:8080/sse"

# ------------------------------------------------------------
# Step 1: Discover available tools from MCP server
# ------------------------------------------------------------
async def load_mcp_tools():
    """Connect to the MCP server and convert its tool list to Ollama's format."""
    async with MCPClient(MCP_SERVER_URL) as mcp:
        mcp_tools = await mcp.list_tools()
    # Ollama expects each tool as a JSON-schema "function" description
    return [
        {
            "type": "function",
            "function": {
                "name": t.name,
                "description": t.description or "",
                "parameters": t.inputSchema,
            },
        }
        for t in mcp_tools
    ]

# ------------------------------------------------------------
# Step 2: Send a user query to Ollama, letting it decide on tools
# ------------------------------------------------------------
async def ask_ollama(messages, tools):
    """Send the conversation plus the tool schema to Ollama and get a response."""
    return await ollama.AsyncClient().chat(model=OLLAMA_MODEL, messages=messages, tools=tools)

# ------------------------------------------------------------
# Step 3: Execute any required tool calls and feed results back
# ------------------------------------------------------------
async def run():
    tools = await load_mcp_tools()
    user_prompt = input("You: ")
    messages = [{"role": "user", "content": user_prompt}]
    response = await ask_ollama(messages, tools)

    # If Ollama requested tool calls, execute them against the MCP server
    if response.message.tool_calls:
        messages.append(response.message)  # keep the assistant turn in the history
        async with MCPClient(MCP_SERVER_URL) as mcp:
            for call in response.message.tool_calls:
                tool_name = call.function.name
                args = dict(call.function.arguments or {})
                result = await mcp.call_tool(tool_name, args)
                # Feed each tool result back to the model as a "tool" message
                messages.append({"role": "tool", "content": str(result)})
        # Let Ollama compose the final answer from the tool results
        response = await ask_ollama(messages, tools)

    print("\nAI:", response.message.content)

if __name__ == "__main__":
    asyncio.run(run())
Run the client in a new terminal:
python client_ollama.py
You can now interact with the agent:
You: Hey, greet Alice and then calculate 150 + 75
AI: *Thinking... I need to use the greet tool and the add tool*
AI: Hello, Alice! Welcome! The sum of 150 and 75 is 225.
Running the System
- Terminal 1 – start Ollama: ollama serve
- Terminal 2 – run the MCP server: python mcp_server.py
- Terminal 3 – run the client: python client_ollama.py
How It Works (Deep Dive)
- Tool discovery – the client requests the list of tools (names, descriptions, and parameter schemas) from the MCP server via MCP's tools/list call.
- Prompt augmentation – the schema is sent to Ollama so the model knows which functions are available (an example is shown after this list).
- Decision – Ollama decides whether a tool call is needed based on the user query.
- Execution – the client invokes the appropriate function via FastMCP (HTTP/SSE).
- Result integration – the tool’s output is fed back to Ollama, which composes the final answer.
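As a concrete example of prompt augmentation, the add tool from mcp_server.py reaches Ollama as a JSON-schema function description roughly like the one below; the exact schema depends on how FastMCP derives it from the type hints and docstring:

# Approximate description of the `add` tool as presented to Ollama
add_tool = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers together",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "integer"},
                "b": {"type": "integer"},
            },
            "required": ["a", "b"],
        },
    },
}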
Customization
Add new tools by defining additional @mcp.tool() functions in mcp_server.py. Example:
@mcp.tool()
def get_weather(city: str) -> str:
"""Return a simple weather description for the given city."""
# Placeholder implementation
return f"The weather in {city} is sunny with 22°C."
Restart the server and the client will automatically discover the new tool.
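If you want the tool to return live data instead of a placeholder, one option is to call an external service with the requests package installed earlier. The sketch below uses wttr.in purely as an example endpoint; its output format is not guaranteed, so treat this as a starting point:

# live_weather_tool.py - sketch of a network-backed tool (wttr.in is only an example endpoint)
import requests
from fastmcp import FastMCP

mcp = FastMCP("Weather Demo")  # or add the function below to the server in mcp_server.py

@mcp.tool()
def get_weather(city: str) -> str:
    """Return a one-line weather summary for the given city."""
    resp = requests.get(f"https://wttr.in/{city}", params={"format": "3"}, timeout=10)
    resp.raise_for_status()
    return resp.text.strip()

if __name__ == "__main__":
    mcp.run(transport="sse", port=8081)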
Troubleshooting
| Problem | Solution |
|---|---|
| Ollama cannot find the model | Verify the model name (ollama list) and that ollama serve is running. |
| Tool not recognized | Ensure the MCP server is running and reachable at MCP_SERVER_URL. |
| Port conflict (8080) | Change the port in mcp_server.py and update MCP_SERVER_URL accordingly. |
| Python virtual environment not activated | Activate it (source myenv/bin/activate or myenv\Scripts\activate). |
| Model returns incorrect calculation | Confirm the model you pulled supports tool calling (e.g., llama3.2). |
Real Project Ideas
Email Assistant
- Goal: Manage emails (list, read, draft, send) via natural language.
- Tools: list_emails(), read_email(id), draft_reply(id, content), send_email(id).
- Tech Stack: FastMCP, imaplib/smtplib, SQLite for local cache.
Personal Knowledge Base
- Goal: Smart note‑taking with tagging, search, and summarization.
- Tools: add_note(title, body), search_notes(query), summarize_note(id) (the first two are sketched below).
- Tech Stack: FastMCP, sqlite3, optional embedding model (e.g., sentence-transformers).
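A minimal sketch of the first two knowledge-base tools, assuming a plain sqlite3 file and simple LIKE-based search; the file name and schema here are illustrative, not part of the tutorial code:

# notes_server.py - illustrative MCP tools for a personal knowledge base
import sqlite3
from fastmcp import FastMCP

mcp = FastMCP("Knowledge Base")
# check_same_thread=False lets the server's worker threads share this connection
db = sqlite3.connect("notes.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")

@mcp.tool()
def add_note(title: str, body: str) -> str:
    """Store a note and return its id."""
    cur = db.execute("INSERT INTO notes (title, body) VALUES (?, ?)", (title, body))
    db.commit()
    return f"Saved note #{cur.lastrowid}"

@mcp.tool()
def search_notes(query: str) -> list[str]:
    """Return titles of notes whose title or body contains the query."""
    rows = db.execute(
        "SELECT id, title FROM notes WHERE title LIKE ? OR body LIKE ?",
        (f"%{query}%", f"%{query}%"),
    ).fetchall()
    return [f"#{note_id}: {title}" for note_id, title in rows]

if __name__ == "__main__":
    mcp.run(transport="sse", port=8082)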
Finance Manager
- Goal: Track expenses, generate budgets, and answer financial queries.
- Tools: add_transaction(date, amount, category), monthly_report(month).
- Tech Stack: FastMCP, pandas, matplotlib for visual reports.
Smart Home Controller
- Goal: Control IoT devices (lights, thermostat, locks) via voice/text.
- Tools: set_light(room, state), set_temperature(value), lock_door(door).
- Tech Stack: FastMCP, MQTT or local REST APIs of smart devices.
Data Analysis Assistant
- Goal: Load CSV/Excel files, run analyses, and produce charts.
- Tools: load_dataset(path), describe_data(), plot(column_x, column_y).
- Tech Stack: FastMCP, pandas, seaborn/matplotlib.
Study Assistant
- Goal: Generate flashcards, quizzes, and spaced‑repetition schedules.
- Tools: create_flashcards(topic), quiz_user(topic), schedule_review(topic).
- Tech Stack: FastMCP, sqlite3, optional LLM for content generation.
Each project idea comes with suggested tools and a recommended tech stack, and follows the same server/client pattern shown above. By following this guide you'll have a functional, locally run AI agent capable of understanding natural-language requests, deciding which tools to invoke, executing them, and returning intelligent responses, all without external API keys or cloud costs.