Building Your First Agentic AI: Complete Guide to MCP + Ollama Tool Calling

Published: December 12, 2025 at 04:54 AM EST
6 min read
Source: Dev.to

Introduction

Learn how to build AI agents that can use tools, make decisions, and take actions—all running locally on your machine.

Background

  • Ollama – a local AI model runtime (similar to Docker for AI).
  • Tool calling – enables an LLM to recognize when it needs a function, select the appropriate tool, pass the correct parameters, and interpret the results (see the example after this list).
  • MCP (Model Context Protocol) – a standardized way for LLMs to connect with tools and data sources (think USB‑C for AI). FastMCP is a Python library that simplifies building MCP servers.
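
To make the tool‑calling bullet concrete, here is the shape of a single tool call as the Ollama Python library returns it (the field layout follows Ollama's tool‑calling API; the values are illustrative):

# Shape of one entry in response.message.tool_calls (illustrative values)
tool_call = {
    "function": {
        "name": "add",                     # which tool the model picked
        "arguments": {"a": 150, "b": 75},  # parameters parsed from the user's request
    }
}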

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    YOUR COMPUTER                            │
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐ │
│  │   Ollama     │◄──►│   Python     │◄──►│   FastMCP    │ │
│  │  (AI Brain)  │    │   Client     │    │   Server     │ │
│  └──────────────┘    └──────────────┘    └──────────────┘ │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Flow

  1. You ask a question.
  2. Python client sends it to Ollama with the list of available tools.
  3. Ollama decides whether a tool is needed.
  4. Client calls the tool via FastMCP.
  5. Results are sent back to Ollama.
  6. Ollama generates the final answer.

Ollama Setup

Linux

curl -fsSL https://ollama.ai/install.sh | sh

macOS

brew install ollama

Windows

Download the installer from https://ollama.com/download and run ollama --version.
You should see something like ollama version 0.1.x. Then start the server:

ollama serve

Keep this terminal open.

Pull a model that supports tool calling

ollama pull llama3.2   # ~2 GB download
ollama run llama3.2

Test it:

>>> Hello! Who are you?
I'm Llama 3.2, an AI assistant...

>>> /bye

Model Support Table

| Model    | Size | Speed  | Best For                      |
|----------|------|--------|-------------------------------|
| llama3.2 | 3 GB | Fast   | Recommended for this tutorial |
| llama3.1 | 5 GB | Medium | More accurate responses       |
| mistral  | 4 GB | Fast   | Good general purpose          |
| qwen2.5  | 4 GB | Fast   | Multilingual support          |

Models without tool support

| Model     | Why Not?                             |
|-----------|--------------------------------------|
| codellama | Built only for code generation       |
| llama2    | Older architecture, no tool support  |
| phi       | Too small for complex tool reasoning |

Attempting to use them yields: Error: does not support tools (status code: 400).

Project Setup

mkdir mcp-ollama-tutorial
cd mcp-ollama-tutorial

# Virtual environment
python -m venv myenv
source myenv/bin/activate   # Linux/macOS
# myenv\Scripts\activate      # Windows

pip install fastmcp ollama requests
python -c "import fastmcp, ollama, requests; print('✅ All packages installed!')"

Building the MCP Server

Create mcp_server.py:

# mcp_server.py
from fastmcp import FastMCP

mcp = FastMCP("My First MCP Server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together"""
    return a + b

@mcp.tool()
def greet(name: str) -> str:
    """Greet someone by name"""
    return f"Hello, {name}! Welcome!"

@mcp.tool()
def multiply(a: float, b: float) -> float:
    """Multiply two numbers"""
    return a * b

@mcp.tool()
def get_time() -> str:
    """Get the current time"""
    from datetime import datetime
    return datetime.now().strftime("%I:%M %p")

if __name__ == "__main__":
    # Streamable HTTP transport serves the MCP endpoint at /mcp,
    # matching the client URL used below (requires a recent FastMCP release)
    mcp.run(transport="http", port=8080)

Run the server:

python mcp_server.py

You should see:

INFO:     Started server process
INFO:     Uvicorn running on http://127.0.0.1:8080
✅ Checkpoint: Your tool server is running!

Keep this terminal open.
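
Optionally, before wiring up Ollama, you can confirm the tools registered correctly with a tiny FastMCP client script (check_tools.py is a hypothetical helper name, not one of the tutorial files):

# check_tools.py — quick sanity check; assumes mcp_server.py is running
import asyncio
from fastmcp import Client

async def main():
    async with Client("http://127.0.0.1:8080/mcp") as client:
        for tool in await client.list_tools():
            print(f"{tool.name}: {tool.description}")

asyncio.run(main())

You should see add, greet, multiply, and get_time listed with their docstrings.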

Building the Client

Create client_ollama.py:

# client_ollama.py
import asyncio

import ollama
from fastmcp import Client as MCPClient

OLLAMA_MODEL = "llama3.2"
MCP_SERVER_URL = "http://127.0.0.1:8080/mcp"

# ------------------------------------------------------------
# Step 1: Discover available tools from the MCP server
# ------------------------------------------------------------
async def load_mcp_tools():
    """Fetch the MCP tool list and convert it to Ollama's tool format."""
    async with MCPClient(MCP_SERVER_URL) as mcp:
        tools = await mcp.list_tools()
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description or "",
                "parameters": tool.inputSchema,  # JSON Schema built from type hints
            },
        }
        for tool in tools
    ]

# ------------------------------------------------------------
# Step 2: Send the conversation to Ollama, letting it decide on tools
# ------------------------------------------------------------
async def ask_ollama(messages, tools):
    """Send messages plus tool schemas to Ollama and return the response."""
    return await ollama.AsyncClient().chat(
        model=OLLAMA_MODEL,
        messages=messages,
        tools=tools,
    )

# ------------------------------------------------------------
# Step 3: Execute any requested tool calls and feed results back
# ------------------------------------------------------------
async def run():
    tools = await load_mcp_tools()
    messages = [{"role": "user", "content": input("You: ")}]
    response = await ask_ollama(messages, tools)

    # If Ollama requests tool calls, execute them via FastMCP
    if response.message.tool_calls:
        messages.append(response.message)
        async with MCPClient(MCP_SERVER_URL) as mcp:
            for call in response.message.tool_calls:
                tool_name = call.function.name
                args = call.function.arguments
                result = await mcp.call_tool(tool_name, args)
                # Use the text of the first content block as the tool result
                result_text = result.content[0].text if result.content else ""
                messages.append(
                    {"role": "tool", "name": tool_name, "content": result_text}
                )
        # Let Ollama compose the final answer from the tool results
        response = await ask_ollama(messages, tools)

    print("\nAI:", response.message.content)

if __name__ == "__main__":
    asyncio.run(run())

Run the client in a new terminal:

python client_ollama.py

You can now interact with the agent:

You: Hey, greet Alice and then calculate 150 + 75
AI: *Thinking... I need to use the greet tool and the add tool*
AI: Hello, Alice! Welcome! The sum of 150 and 75 is 225.

Running the System

  1. Terminal 1 – start Ollama: ollama serve
  2. Terminal 2 – run the MCP server: python mcp_server.py
  3. Terminal 3 – run the client: python client_ollama.py

How It Works (Deep Dive)

  • Tool discovery – the client calls the server's tools/list method and receives a JSON schema for each tool.
  • Prompt augmentation – the schemas are sent to Ollama so the model knows which functions are available (see the sketch after this list).
  • Decision – Ollama decides whether a tool call is needed based on the user query.
  • Execution – the client invokes the appropriate function via FastMCP (HTTP/SSE).
  • Result integration – the tool’s output is fed back to Ollama, which composes the final answer.
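
For instance, the add tool from mcp_server.py reaches Ollama as a function schema roughly like the following (the envelope follows Ollama's tool‑calling format; FastMCP derives the parameter schema from the function's type hints and docstring):

# Approximate schema for the add() tool as presented to Ollama
ADD_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers together",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "integer"},
                "b": {"type": "integer"},
            },
            "required": ["a", "b"],
        },
    },
}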

Customization

Add new tools by defining additional @mcp.tool() functions in mcp_server.py. Example:

@mcp.tool()
def get_weather(city: str) -> str:
    """Return a simple weather description for the given city."""
    # Placeholder implementation
    return f"The weather in {city} is sunny with 22°C."

Restart the server and the client will automatically discover the new tool.
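
With the example get_weather tool above, an exchange might then look like:

You: What's the weather in Paris?
AI: The weather in Paris is sunny with 22°C.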

Troubleshooting

| Problem                                   | Solution                                                                |
|-------------------------------------------|-------------------------------------------------------------------------|
| Ollama cannot find the model              | Verify the model name (ollama list) and that ollama serve is running.   |
| Tool not recognized                       | Ensure the MCP server is running and reachable at MCP_SERVER_URL.       |
| Port conflict (8080)                      | Change the port in mcp_server.py and update MCP_SERVER_URL accordingly. |
| Python virtual environment not activated  | Activate it (source myenv/bin/activate or myenv\Scripts\activate).      |
| Model returns incorrect calculation       | Confirm the model you pulled supports tool calling (e.g., llama3.2).    |

Real Project Ideas

Email Assistant

  • Goal: Manage emails (list, read, draft, send) via natural language.
  • Tools: list_emails(), read_email(id), draft_reply(id, content), send_email(id).
  • Tech Stack: FastMCP, imaplib/smtplib, SQLite for local cache.

Personal Knowledge Base

  • Goal: Smart note‑taking with tagging, search, and summarization.
  • Tools: add_note(title, body), search_notes(query), summarize_note(id) – see the sketch after this list.
  • Tech Stack: FastMCP, sqlite3, optional embedding model (e.g., sentence‑transformers).
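
A minimal sketch of how the first two tools could look (the notes.db filename, table schema, and port are assumptions, not prescriptions):

# kb_server.py — Personal Knowledge Base sketch (assumed schema)
import sqlite3
from fastmcp import FastMCP

mcp = FastMCP("Knowledge Base")
db = sqlite3.connect("notes.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")

@mcp.tool()
def add_note(title: str, body: str) -> int:
    """Store a note and return its id."""
    cur = db.execute("INSERT INTO notes (title, body) VALUES (?, ?)", (title, body))
    db.commit()
    return cur.lastrowid

@mcp.tool()
def search_notes(query: str) -> list[str]:
    """Return titles of notes whose title or body matches the query."""
    rows = db.execute(
        "SELECT title FROM notes WHERE title LIKE ? OR body LIKE ?",
        (f"%{query}%", f"%{query}%"),
    ).fetchall()
    return [row[0] for row in rows]

if __name__ == "__main__":
    mcp.run(transport="http", port=8081)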

Finance Manager

  • Goal: Track expenses, generate budgets, and answer financial queries.
  • Tools: add_transaction(date, amount, category), monthly_report(month).
  • Tech Stack: FastMCP, pandas, matplotlib for visual reports.

Smart Home Controller

  • Goal: Control IoT devices (lights, thermostat, locks) via voice/text.
  • Tools: set_light(room, state), set_temperature(value), lock_door(door).
  • Tech Stack: FastMCP, MQTT or local REST APIs of smart devices.

Data Analysis Assistant

  • Goal: Load CSV/Excel files, run analyses, and produce charts.
  • Tools: load_dataset(path), describe_data(), plot(column_x, column_y) – see the sketch after this list.
  • Tech Stack: FastMCP, pandas, seaborn/matplotlib.
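
A sketch of the first two tools, assuming a module‑level DataFrame cache (the cache and port are our own assumptions; names follow the list above):

# analysis_server.py — partial Data Analysis Assistant sketch
import pandas as pd
from fastmcp import FastMCP

mcp = FastMCP("Data Analysis Assistant")
_df = None  # module-level cache for the loaded dataset

@mcp.tool()
def load_dataset(path: str) -> str:
    """Load a CSV file into memory."""
    global _df
    _df = pd.read_csv(path)
    return f"Loaded {len(_df)} rows and {len(_df.columns)} columns."

@mcp.tool()
def describe_data() -> str:
    """Return summary statistics for the loaded dataset."""
    if _df is None:
        return "No dataset loaded yet."
    return _df.describe().to_string()

if __name__ == "__main__":
    mcp.run(transport="http", port=8082)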

Study Assistant

  • Goal: Generate flashcards, quizzes, and spaced‑repetition schedules.
  • Tools: create_flashcards(topic), quiz_user(topic), schedule_review(topic).
  • Tech Stack: FastMCP, sqlite3, optional LLM for content generation.

Each project idea lists its core tools and a recommended tech stack, and the implementation pattern mirrors the tutorial code (see the sketches above). By following the guide you’ll have a functional, locally‑run AI agent capable of understanding natural‑language requests, deciding which tools to invoke, executing them, and returning intelligent responses—all without external API keys or cloud costs.
