Building Your First Agentic AI: Complete Guide to MCP + Ollama Tool Calling

Published: December 12, 2025 at 04:54 AM EST
6 min read
Source: Dev.to

Introduction

Learn how to build AI agents that can use tools, make decisions, and take actions—all running locally on your machine.

Background

  • Ollama – a local AI model runtime (similar to Docker for AI).
  • Tool calling – enables an LLM to recognize when it needs a function, select the appropriate tool, pass the correct parameters, and interpret the results (see the example after this list).
  • MCP (Model Context Protocol) – a standardized way for LLMs to connect with tools and data sources (think USB‑C for AI). FastMCP is a Python library that simplifies building MCP servers.
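
To make the tool‑calling bullet concrete, here is the shape of a single tool call as the Ollama Python library returns it (the field layout follows Ollama's tool‑calling API; the values are illustrative):

# Shape of one entry in response.message.tool_calls (illustrative values)
tool_call = {
    "function": {
        "name": "add",                     # which tool the model picked
        "arguments": {"a": 150, "b": 75},  # parameters parsed from the user's request
    }
}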

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    YOUR COMPUTER                            │
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐ │
│  │   Ollama     │◄──►│   Python     │◄──►│   FastMCP    │ │
│  │  (AI Brain)  │    │   Client     │    │   Server     │ │
│  └──────────────┘    └──────────────┘    └──────────────┘ │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Flow

  1. You ask a question.
  2. Python client sends it to Ollama with the list of available tools.
  3. Ollama decides whether a tool is needed.
  4. Client calls the tool via FastMCP.
  5. Results are sent back to Ollama.
  6. Ollama generates the final answer.

Ollama Setup

Linux

curl -fsSL https://ollama.ai/install.sh | sh

macOS

brew install ollama

Windows

Download the installer from https://ollama.com/download and run ollama --version.
You should see something like ollama version 0.1.x. Then start the server:

ollama serve

Keep this terminal open.

Pull a model that supports tool calling

ollama pull llama3.2   # ~2 GB download
ollama run llama3.2

Test it:

>>> Hello! Who are you?
I'm Llama 3.2, an AI assistant...

>>> /bye

Model Support Table

| Model    | Size | Speed  | Best For                      |
|----------|------|--------|-------------------------------|
| llama3.2 | 3 GB | Fast   | Recommended for this tutorial |
| llama3.1 | 5 GB | Medium | More accurate responses       |
| mistral  | 4 GB | Fast   | Good general purpose          |
| qwen2.5  | 4 GB | Fast   | Multilingual support          |

Models without tool support

| Model     | Why Not?                             |
|-----------|--------------------------------------|
| codellama | Built only for code generation       |
| llama2    | Older architecture, no tool support  |
| phi       | Too small for complex tool reasoning |

Attempting to use them yields: Error: does not support tools (status code: 400).

Project Setup

mkdir mcp-ollama-tutorial
cd mcp-ollama-tutorial

# Virtual environment
python -m venv myenv
source myenv/bin/activate   # Linux/macOS
# myenv\Scripts\activate      # Windows

pip install fastmcp ollama requests
python -c "import fastmcp, ollama, requests; print('✅ All packages installed!')"

Building the MCP Server

Create mcp_server.py:

# mcp_server.py
from fastmcp import FastMCP

mcp = FastMCP("My First MCP Server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together"""
    return a + b

@mcp.tool()
def greet(name: str) -> str:
    """Greet someone by name"""
    return f"Hello, {name}! Welcome!"

@mcp.tool()
def multiply(a: float, b: float) -> float:
    """Multiply two numbers"""
    return a * b

@mcp.tool()
def get_time() -> str:
    """Get the current time"""
    from datetime import datetime
    return datetime.now().strftime("%I:%M %p")

if __name__ == "__main__":
    # Streamable HTTP transport serves the MCP endpoint at /mcp,
    # matching the client URL used below (requires a recent FastMCP release)
    mcp.run(transport="http", port=8080)

Run the server:

python mcp_server.py

You should see:

INFO:     Started server process
INFO:     Uvicorn running on http://127.0.0.1:8080
✅ Checkpoint: Your tool server is running!

Keep this terminal open.
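
Optionally, before wiring up Ollama, you can confirm the tools registered correctly with a tiny FastMCP client script (check_tools.py is a hypothetical helper name, not one of the tutorial files):

# check_tools.py — quick sanity check; assumes mcp_server.py is running
import asyncio
from fastmcp import Client

async def main():
    async with Client("http://127.0.0.1:8080/mcp") as client:
        for tool in await client.list_tools():
            print(f"{tool.name}: {tool.description}")

asyncio.run(main())

You should see add, greet, multiply, and get_time listed with their docstrings.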

Building the Client

Create client_ollama.py:

# client_ollama.py
import asyncio

import ollama
from fastmcp import Client as MCPClient

OLLAMA_MODEL = "llama3.2"
MCP_SERVER_URL = "http://127.0.0.1:8080/mcp"

# ------------------------------------------------------------
# Step 1: Discover available tools from the MCP server
# ------------------------------------------------------------
async def load_mcp_tools():
    """Fetch the MCP tool list and convert it to Ollama's tool format."""
    async with MCPClient(MCP_SERVER_URL) as mcp:
        tools = await mcp.list_tools()
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description or "",
                "parameters": tool.inputSchema,  # JSON Schema built from type hints
            },
        }
        for tool in tools
    ]

# ------------------------------------------------------------
# Step 2: Send the conversation to Ollama, letting it decide on tools
# ------------------------------------------------------------
async def ask_ollama(messages, tools):
    """Send messages plus tool schemas to Ollama and return the response."""
    return await ollama.AsyncClient().chat(
        model=OLLAMA_MODEL,
        messages=messages,
        tools=tools,
    )

# ------------------------------------------------------------
# Step 3: Execute any requested tool calls and feed results back
# ------------------------------------------------------------
async def run():
    tools = await load_mcp_tools()
    messages = [{"role": "user", "content": input("You: ")}]
    response = await ask_ollama(messages, tools)

    # If Ollama requests tool calls, execute them via FastMCP
    if response.message.tool_calls:
        messages.append(response.message)
        async with MCPClient(MCP_SERVER_URL) as mcp:
            for call in response.message.tool_calls:
                tool_name = call.function.name
                args = call.function.arguments
                result = await mcp.call_tool(tool_name, args)
                # Use the text of the first content block as the tool result
                result_text = result.content[0].text if result.content else ""
                messages.append(
                    {"role": "tool", "name": tool_name, "content": result_text}
                )
        # Let Ollama compose the final answer from the tool results
        response = await ask_ollama(messages, tools)

    print("\nAI:", response.message.content)

if __name__ == "__main__":
    asyncio.run(run())

Run the client in a new terminal:

python client_ollama.py

You can now interact with the agent:

You: Hey, greet Alice and then calculate 150 + 75
AI: *Thinking... I need to use the greet tool and the add tool*
AI: Hello, Alice! Welcome! The sum of 150 and 75 is 225.

Running the System

  1. Terminal 1 – start Ollama: ollama serve
  2. Terminal 2 – run the MCP server: python mcp_server.py
  3. Terminal 3 – run the client: python client_ollama.py

How It Works (Deep Dive)

  • Tool discovery – the client calls the server's tools/list method and receives a JSON schema for each tool.
  • Prompt augmentation – the schemas are sent to Ollama so the model knows which functions are available (see the sketch after this list).
  • Decision – Ollama decides whether a tool call is needed based on the user query.
  • Execution – the client invokes the appropriate function via FastMCP (HTTP/SSE).
  • Result integration – the tool’s output is fed back to Ollama, which composes the final answer.
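
For instance, the add tool from mcp_server.py reaches Ollama as a function schema roughly like the following (the envelope follows Ollama's tool‑calling format; FastMCP derives the parameter schema from the function's type hints and docstring):

# Approximate schema for the add() tool as presented to Ollama
ADD_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers together",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "integer"},
                "b": {"type": "integer"},
            },
            "required": ["a", "b"],
        },
    },
}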

Customization

Add new tools by defining additional @mcp.tool() functions in mcp_server.py. Example:

@mcp.tool()
def get_weather(city: str) -> str:
    """Return a simple weather description for the given city."""
    # Placeholder implementation
    return f"The weather in {city} is sunny with 22°C."

Restart the server and the client will automatically discover the new tool.
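
With the example get_weather tool above, an exchange might then look like:

You: What's the weather in Paris?
AI: The weather in Paris is sunny with 22°C.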

Troubleshooting

| Problem                                   | Solution                                                                |
|-------------------------------------------|-------------------------------------------------------------------------|
| Ollama cannot find the model              | Verify the model name (ollama list) and that ollama serve is running.   |
| Tool not recognized                       | Ensure the MCP server is running and reachable at MCP_SERVER_URL.       |
| Port conflict (8080)                      | Change the port in mcp_server.py and update MCP_SERVER_URL accordingly. |
| Python virtual environment not activated  | Activate it (source myenv/bin/activate or myenv\Scripts\activate).      |
| Model returns incorrect calculation       | Confirm the model you pulled supports tool calling (e.g., llama3.2).    |

Real Project Ideas

Email Assistant

  • Goal: Manage emails (list, read, draft, send) via natural language.
  • Tools: list_emails(), read_email(id), draft_reply(id, content), send_email(id).
  • Tech Stack: FastMCP, imaplib/smtplib, SQLite for local cache.

Personal Knowledge Base

  • Goal: Smart note‑taking with tagging, search, and summarization.
  • Tools: add_note(title, body), search_notes(query), summarize_note(id) – see the sketch after this list.
  • Tech Stack: FastMCP, sqlite3, optional embedding model (e.g., sentence‑transformers).
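
A minimal sketch of how the first two tools could look (the notes.db filename, table schema, and port are assumptions, not prescriptions):

# kb_server.py — Personal Knowledge Base sketch (assumed schema)
import sqlite3
from fastmcp import FastMCP

mcp = FastMCP("Knowledge Base")
db = sqlite3.connect("notes.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")

@mcp.tool()
def add_note(title: str, body: str) -> int:
    """Store a note and return its id."""
    cur = db.execute("INSERT INTO notes (title, body) VALUES (?, ?)", (title, body))
    db.commit()
    return cur.lastrowid

@mcp.tool()
def search_notes(query: str) -> list[str]:
    """Return titles of notes whose title or body matches the query."""
    rows = db.execute(
        "SELECT title FROM notes WHERE title LIKE ? OR body LIKE ?",
        (f"%{query}%", f"%{query}%"),
    ).fetchall()
    return [row[0] for row in rows]

if __name__ == "__main__":
    mcp.run(transport="http", port=8081)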

Finance Manager

  • Goal: Track expenses, generate budgets, and answer financial queries.
  • Tools: add_transaction(date, amount, category), monthly_report(month).
  • Tech Stack: FastMCP, pandas, matplotlib for visual reports.

Smart Home Controller

  • Goal: Control IoT devices (lights, thermostat, locks) via voice/text.
  • Tools: set_light(room, state), set_temperature(value), lock_door(door).
  • Tech Stack: FastMCP, MQTT or local REST APIs of smart devices.

Data Analysis Assistant

  • Goal: Load CSV/Excel files, run analyses, and produce charts.
  • Tools: load_dataset(path), describe_data(), plot(column_x, column_y) – see the sketch after this list.
  • Tech Stack: FastMCP, pandas, seaborn/matplotlib.
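
A sketch of the first two tools, assuming a module‑level DataFrame cache (the cache and port are our own assumptions; names follow the list above):

# analysis_server.py — partial Data Analysis Assistant sketch
import pandas as pd
from fastmcp import FastMCP

mcp = FastMCP("Data Analysis Assistant")
_df = None  # module-level cache for the loaded dataset

@mcp.tool()
def load_dataset(path: str) -> str:
    """Load a CSV file into memory."""
    global _df
    _df = pd.read_csv(path)
    return f"Loaded {len(_df)} rows and {len(_df.columns)} columns."

@mcp.tool()
def describe_data() -> str:
    """Return summary statistics for the loaded dataset."""
    if _df is None:
        return "No dataset loaded yet."
    return _df.describe().to_string()

if __name__ == "__main__":
    mcp.run(transport="http", port=8082)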

Study Assistant

  • Goal: Generate flashcards, quizzes, and spaced‑repetition schedules.
  • Tools: create_flashcards(topic), quiz_user(topic), schedule_review(topic).
  • Tech Stack: FastMCP, sqlite3, optional LLM for content generation.

Each project idea lists its core tools and a recommended tech stack, and the implementation pattern mirrors the tutorial code (see the sketches above). By following the guide you’ll have a functional, locally‑run AI agent capable of understanding natural‑language requests, deciding which tools to invoke, executing them, and returning intelligent responses—all without external API keys or cloud costs.
