Building a Containerized GenAI Chatbot with Docker, Ollama, FastAPI & ChromaDB

Published: February 3, 2026 at 10:17 AM EST
2 min read
Source: Dev.to

Introduction

Modern AI systems are not just Python scripts — they are distributed systems involving:

  • LLM engines
  • APIs
  • UI applications
  • Vector databases
  • Container orchestration

In this post, I share how I built a GenAI chatbot using Docker-based microservices, structured the way real-world AI platforms are, and the real DevOps issues I faced along the way.

System Architecture (AI Architect View)

User (Browser)
    |
    v
[ Streamlit UI ]
    |
    v
[ FastAPI Backend ]
    |
    +----> [ Ollama LLM Engine ]
    |
    +----> [ ChromaDB Vector Database ]

Why Microservices?

Separating AI components into services improves scalability, maintainability, and fault isolation: the UI can crash and restart without taking down the model server, and you can swap the LLM engine without touching the frontend.

Project Structure

genai-docker-project/
├── backend/        # FastAPI + AI logic
├── ui/             # Streamlit UI
├── docker-compose.yml
└── README.md

Docker Compose (Core of System)

version: "3.8"

services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"   # Ollama's REST API

  backend:
    build: ./backend
    ports:
      - "8000:8000"
    depends_on:
      - ollama
      - chroma

  chroma:
    image: chromadb/chroma
    ports:
      - "8001:8000"     # host 8001 -> container 8000, avoiding a clash with the backend

  ui:
    build: ./ui
    ports:
      - "8501:8501"     # Streamlit's default port
    depends_on:
      - backend
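
One Compose caveat worth calling out: depends_on only controls start order, not readiness, so the backend can boot while Ollama is still loading. Below is a sketch of a readiness gate using a healthcheck; the healthcheck block and the service_healthy condition are my additions, not part of the original file (ollama list only succeeds once the server is answering):

services:
  ollama:
    image: ollama/ollama
    healthcheck:
      test: ["CMD", "ollama", "list"]   # passes only once the API responds
      interval: 10s
      timeout: 5s
      retries: 5

  backend:
    build: ./backend
    depends_on:
      ollama:
        condition: service_healthy      # wait for a passing healthcheck, not just "started"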

Backend (FastAPI + Ollama Integration)

from fastapi import FastAPI
import requests

app = FastAPI()

# "ollama" resolves via Docker's internal DNS (see Error 4 below).
OLLAMA_URL = "http://ollama:11434/api/generate"

@app.post("/ask")
def ask_ai(question: str):
    # stream=False makes Ollama return a single JSON object; the default streams
    # newline-delimited chunks, which would break response.json().
    payload = {"model": "mistral", "prompt": question, "stream": False}
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()
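
The compose file wires in ChromaDB, but the minimal backend above never touches it. Here is a hedged sketch of how retrieval could plug in using the chromadb Python client; the endpoint name, the "docs" collection, and the prompt format are all my own placeholders, not part of the original backend:

import chromadb

# Talk to the chroma service over the compose network.
# Note: 8000 is the container-internal port, not the 8001 host mapping.
chroma_client = chromadb.HttpClient(host="chroma", port=8000)
collection = chroma_client.get_or_create_collection("docs")

@app.post("/ask_with_context")
def ask_with_context(question: str):
    # Fetch the 3 most similar stored documents and prepend them to the prompt.
    hits = collection.query(query_texts=[question], n_results=3)
    context = "\n".join(hits["documents"][0])
    payload = {
        "model": "mistral",
        "prompt": f"Context:\n{context}\n\nQuestion: {question}",
        "stream": False,
    }
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()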

UI (Streamlit)

import streamlit as st
import requests

st.title("GenAI Chatbot")

question = st.text_input("Ask a question:")

if st.button("Ask AI"):
    # "backend" resolves via Docker DNS; the question travels as a query parameter.
    res = requests.post("http://backend:8000/ask", params={"question": question}, timeout=120)
    data = res.json()
    # Ollama puts the generated text in the "response" field; fall back to raw JSON.
    st.write(data.get("response", data))
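
With all four services defined, bringing the whole stack up is one command from the project root:

docker compose up --build

Then open http://localhost:8501 in a browser for the Streamlit UI; the FastAPI interactive docs are at http://localhost:8000/docs.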

Real Errors & DevOps Solutions

Error 1: Docker Permission Denied

permission denied while trying to connect to the Docker daemon socket

Root cause: User was not part of the docker group.

Solution:

sudo usermod -aG docker $USER
newgrp docker
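
Note that newgrp only applies the new group to the current shell; log out and back in for it to take effect everywhere. To confirm the fix:

docker ps   # should list containers without a permission error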

Error 2: Port Already in Use (11434)

failed to bind host port 0.0.0.0:11434: address already in use

Root cause: Ollama was already running on the host via Snap.

Solution:

sudo snap stop ollama
sudo snap disable ollama

Alternatively, change the Docker port mapping, e.g.:

ports:
  - "21434:11434"
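
To find what is holding the port in the first place, sudo ss -ltnp | grep 11434 (or sudo lsof -i :11434) shows the offending process. Also note that only host access changes with the remap: containers still reach Ollama over the internal network on 11434, so OLLAMA_URL stays the same.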

Error 3: Model Not Found in Ollama

"model 'mistral' not found"

Root cause: The Ollama container did not have the model downloaded.

Solution:

docker exec -it genai-docker-project-ollama-1 bash   # name follows <project-dir>-<service>-<index>
ollama pull mistral
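
A hardcoded container name is brittle, so the same pull via the service name is safer:

docker compose exec ollama ollama pull mistral

To keep pulled models across container recreation, you could also mount a volume at /root/.ollama, where Ollama stores its models; the volume name here is my choice:

services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama-models:/root/.ollama

volumes:
  ollama-models: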

Error 4: Container Networking Issues

Problem: Backend could not connect to Ollama.

Root cause: Using localhost instead of the container DNS name.

Fix: Set the URL to the service name:

OLLAMA_URL = "http://ollama:11434/api/generate"
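
Compose puts every service on a shared default network where each container is reachable by its service name; localhost inside the backend container points at the backend itself, not the host. A quick sanity check from inside the backend (assuming Python and requests are in the image, which they are for this backend):

docker compose exec backend python -c "import requests; print(requests.get('http://ollama:11434').text)"

This should print Ollama's banner ("Ollama is running").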

Key Learnings (AI + DevOps)

  • AI systems are distributed systems.
  • Docker is essential for reproducible ML environments.
  • LLM platforms require careful networking and resource management.
  • MLOps bridges DevOps and AI.