Self-Hosted AI in 2026: Automating Your Linux Workflow with n8n and Ollama

Published: February 20, 2026 at 12:01 AM EST
2 min read
Source: Dev.to

Why Combine Ollama and n8n?

  • Low latency – no external API round‑trips.
  • Privacy – logs, secrets, and prompts never leave your hardware.
  • No subscriptions – one‑time hardware cost, zero monthly fees.
  • Full control – run any model you like (Llama 3.x, Mistral, DeepSeek, etc.).

Supported environment

  • Any modern Linux distribution (Ubuntu 24.04+ or Debian 13 recommended).
  • Ollama – the simplest way to run LLMs locally.
  • n8n – “Zapier for self‑hosters” with built‑in AI nodes.
  • Docker – for easy deployment and isolation.

Installing Ollama

curl -fsSL https://ollama.com/install.sh | sh

Verify the installation and pull a versatile model (e.g., Llama 3):

ollama pull llama3
ollama run llama3 "Hello, world!"
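Beyond the CLI, Ollama also listens on a local REST API (port 11434 by default), which is what n8n will talk to later. A minimal sketch of a request body for Ollama's documented `/api/generate` endpoint (the `llama3` model name assumes the pull above):

```shell
# JSON body for Ollama's /api/generate endpoint; "stream": false
# asks for one complete JSON reply instead of a token stream.
payload='{"model": "llama3", "prompt": "Hello, world!", "stream": false}'
echo "$payload"

# Send it once Ollama is running (default port 11434):
#   curl -s http://localhost:11434/api/generate -d "$payload"
```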

Deploying n8n with Docker Compose

Create a docker-compose.yml file:

version: '3.8'

services:
  n8n:
    image: n8nio/n8n:latest
    restart: always
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
    volumes:
      - n8n_data:/home/node/.n8n
    # Allow n8n to reach Ollama on the host
    extra_hosts:
      - "host.docker.internal:host-gateway"

volumes:
  n8n_data:

Start the stack:

docker compose up -d

Open n8n at http://localhost:5678.

Adding an Ollama Node

  1. Add an Ollama node to your workflow.
  2. Configure credentials: set the URL to http://host.docker.internal:11434.
  3. Select your model (e.g., llama3).
  4. Connect the node to a trigger (HTTP request, Cron, etc.).
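If the credential test fails, it is usually the container-to-host hop rather than n8n itself. A quick sanity check, assuming the compose file above (`/api/tags` is Ollama's documented model-listing route):

```shell
# The base URL the n8n credential needs: the host-gateway alias from
# the compose file's extra_hosts entry, on Ollama's default port.
OLLAMA_URL="http://host.docker.internal:11434"
echo "$OLLAMA_URL"

# From the host, confirm Ollama answers at all:
#   curl -s http://localhost:11434/api/tags
# From inside the n8n container, confirm the alias resolves:
#   docker compose exec n8n wget -qO- http://host.docker.internal:11434/api/tags
```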

Example Workflow: Log Summarization Email

  1. Node 1 – Execute Command

     tail -n 100 /var/log/syslog

  2. Node 2 – Ollama

     Prompt: “Summarize these logs and highlight any security warnings or critical errors.”

  3. Node 3 – Email / Discord

     Send the generated summary to your preferred channel.
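Outside n8n, the first two nodes can be sketched as plain shell: collect the logs, then build the prompt the Ollama node would send. A hedged sketch (the sample log line stands in for real `tail` output; in the actual workflow, n8n's Ollama node makes the HTTP call for you):

```shell
# Node 1: collect the raw logs. A sample line stands in here for
# the real `tail -n 100 /var/log/syslog` output.
logs="Feb 20 00:01:02 host sshd[1234]: Failed password for root"

# Node 2: assemble the prompt the Ollama node would send.
prompt="Summarize these logs and highlight any security warnings or critical errors:
$logs"
echo "$prompt"

# Node 3 then forwards the model's reply to email or Discord.
```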

Performance Tips

  • GPU acceleration – Install nvidia-container-toolkit so Docker can use CUDA.
  • Model quantization – 4‑bit or 6‑bit quantizations provide a good speed‑accuracy trade‑off.
  • VRAM requirements
    • 7B–8B models: ~8 GB of VRAM is sufficient.
    • 70B models: 24 GB+ of VRAM (or a high‑end workstation such as a Mac Studio).
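The rule of thumb behind those numbers: quantized weights take roughly parameters × bits ÷ 8 gigabytes, so an 8B model at 4 bits needs about 4 GB for weights alone (KV cache and runtime overhead come on top), while a 70B model at 4 bits already needs ~35 GB, hence the jump past a single consumer card. A back-of-envelope sketch, weights only:

```shell
# Weights-only size estimate: params (billions) * bits-per-weight / 8
# gives gigabytes. KV cache and runtime overhead are extra.
params_b=8   # an 8B model
bits=4       # 4-bit quantization
echo "$(( params_b * bits / 8 )) GB"   # → 4 GB for weights alone
```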

Closing Thoughts

Self‑hosting your AI isn’t just a technical choice; it’s a step toward reclaiming ownership of your tools. If you build something cool with this stack, feel free to share it in the comments.

Happy hacking!
