Local AI Tools: Getting Started with Ollama (Tool 1)

Published: December 9, 2025 at 07:15 AM EST
3 min read
Source: Dev.to

Overview

Among accessible local AI tools, one of the most popular options today is Ollama. It is known for its simplicity, reliability, and availability across all major operating systems (Windows, Linux, and macOS). After downloading the installer from the official Ollama website and completing a brief registration, you can run the desktop application and choose from a collection of pre‑trained AI models. A full list of models is available on the Ollama website, and many can be downloaded directly from within the application.

Ollama also offers a paid subscription for cloud‑based usage, but the desktop version remains the easiest entry point.

Docker Deployment

For developers who need more control or want to integrate Ollama into automated workflows, the tool can be run inside a Docker container.

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama3.2

The container starts a local server accessible at http://localhost:11434/.

Note: The Docker version does not include a web interface; interaction is performed exclusively through the API.
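
To quickly verify that the containerized server is reachable, a single request to the base URL is enough (a minimal check, assuming Node 18+ with built-in fetch and the default port mapping shown above):

// Simple reachability check against the local Ollama server (default port 11434).
fetch("http://localhost:11434/")
  .then((res) => res.text())
  .then((text) => console.log("Server replied:", text))
  .catch((err) => console.error("Ollama is not reachable:", err));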

API Usage

The Ollama API is documented on the website and supports simple HTTP requests for:

  • Listing available models
  • Submitting a prompt
  • Receiving responses programmatically (streamed)
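
For example, the models currently installed on the local server can be listed with a GET request to the /api/tags endpoint (a minimal sketch, assuming Node 18+ with built-in fetch):

// List the models installed on the local Ollama server.
// The /api/tags endpoint returns a JSON object with a "models" array.
async function listModels() {
  const response = await fetch("http://localhost:11434/api/tags");
  const data = await response.json();
  for (const model of data.models) {
    console.log(model.name); // e.g. "llama3.2:latest"
  }
}

listModels();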

Example: Streaming a Prompt

async function askModel() {
  // Send the prompt to the local Ollama server
  const response = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2",
      prompt: "Who are you?"
    })
  });

  // Read the streamed response chunk by chunk
  const reader = response.body.getReader();
  const decoder = new TextDecoder("utf-8");

  // Each chunk contains one or more JSON lines; print them as they arrive
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(decoder.decode(value));
  }
}

askModel();

Sample streamed output:

{
  "model": "llama3.2",
  "created_at": "2025-12-09T10:20:54.622934272Z",
  "response": "I",
  "done": false
}
{
  "model": "llama3.2",
  "created_at": "2025-12-09T10:20:54.761152451Z",
  "response": "'m",
  "done": false
}
...
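
Each line of the stream is a standalone JSON object, so the full answer can be assembled by parsing those lines and concatenating their response fields until done is true. A minimal sketch (same endpoint and model as above, assuming Node 18+):

// Collect the streamed NDJSON chunks into a single answer string.
async function askModelAndCollect(prompt) {
  const response = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "llama3.2", prompt })
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder("utf-8");
  let buffer = "";
  let answer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Every complete line in the buffer is one JSON object.
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep any partial line for the next chunk
    for (const line of lines) {
      if (!line.trim()) continue;
      const chunk = JSON.parse(line);
      answer += chunk.response ?? "";
      if (chunk.done) return answer;
    }
  }
  return answer;
}

askModelAndCollect("Who are you?").then((text) => console.log(text));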

Building a Custom Docker Image

Creating a Dockerfile with a small entrypoint script lets the container pull a specific model automatically when it starts, so the model is ready to use without a manual docker exec step.

Dockerfile

# Base image with the Ollama server
FROM ollama/ollama:latest
# Entrypoint script that pulls the model on startup
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
# Default Ollama API port
EXPOSE 11434
ENTRYPOINT ["/entrypoint.sh"]

entrypoint.sh

#!/bin/bash
# Start the Ollama server in the background
ollama serve &
# Give the server a moment to start before pulling
sleep 3
# Pull the model so it is available as soon as the container is up
ollama pull llama3.2
# Keep the server process in the foreground
wait

Build and Run

docker build -t my-ollama .
docker run -p 11434:11434 my-ollama

This setup makes Ollama a plug‑and‑play module that can be incorporated into web services, automation pipelines, or local development environments.
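
As an illustration, a small Node.js service can expose its own endpoint and forward prompts to the container. The sketch below is a rough example: it assumes Node 18+ run as an ES module, uses the non-streaming mode of /api/generate ("stream": false), and the /ask route and port 3000 are arbitrary choices.

// Tiny HTTP wrapper around the Ollama container started above.
// Save as e.g. server.mjs and run with: node server.mjs
import http from "node:http";

const server = http.createServer((req, res) => {
  if (req.method !== "POST" || req.url !== "/ask") {
    res.writeHead(404).end();
    return;
  }

  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", async () => {
    try {
      const { prompt } = JSON.parse(body);
      // Ask Ollama for a single, non-streamed completion.
      const ollamaRes = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: "llama3.2", prompt, stream: false })
      });
      const data = await ollamaRes.json();
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ answer: data.response }));
    } catch (err) {
      res.writeHead(500).end(JSON.stringify({ error: String(err) }));
    }
  });
});

server.listen(3000, () => console.log("Listening on http://localhost:3000/ask"));

A POST to /ask with a body such as {"prompt": "Who are you?"} then returns the model's answer as JSON.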

Integration with Development Tools

The Ollama documentation provides examples for connecting the service to:

  • n8n
  • Visual Studio Code
  • JetBrains IDEs
  • Codex and other workflow tools

These integrations help embed Ollama into broader automation or development pipelines.

Conclusion

Ollama’s main advantage is its simplicity: easy installation, cross‑platform availability, and a straightforward API. It’s an excellent starting point for users who want to experiment with AI models locally without complex setup.

However, for developers requiring deeper customization, advanced configuration, or high‑performance integration into complex workflows, alternative tools may offer better efficiency and flexibility. Future articles in this series will explore those options.
