Building a Containerized GenAI Chatbot with Docker, Ollama, FastAPI & ChromaDB

Published: February 3, 2026 at 10:17 AM EST
2 min read
Source: Dev.to

Introduction

Modern AI systems are not just Python scripts — they are distributed systems involving:

  • LLM engines
  • APIs
  • UI applications
  • Vector databases
  • Container orchestration

In this post, I share how I built a GenAI chatbot using Docker-based microservices, structured the way real-world AI platforms are, and the real DevOps issues I faced along the way.

System Architecture (AI Architect View)

User (Browser)
    |
    v
[ Streamlit UI ]
    |
    v
[ FastAPI Backend ]
    |
    +----> [ Ollama LLM Engine ]
    |
    +----> [ ChromaDB Vector Database ]

Why Microservices?

Separating AI components into services improves scalability, maintainability, and fault isolation: the UI can crash and restart without taking down the model server, and you can swap the LLM engine without touching the frontend.

Project Structure

genai-docker-project/
├── backend/        # FastAPI + AI logic
├── ui/             # Streamlit UI
├── docker-compose.yml
└── README.md

Docker Compose (Core of System)

version: "3.8"

services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"   # Ollama's REST API

  backend:
    build: ./backend
    ports:
      - "8000:8000"
    depends_on:
      - ollama
      - chroma

  chroma:
    image: chromadb/chroma
    ports:
      - "8001:8000"     # host 8001 -> container 8000, avoiding a clash with the backend

  ui:
    build: ./ui
    ports:
      - "8501:8501"     # Streamlit's default port
    depends_on:
      - backend
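
One Compose caveat worth calling out: depends_on only controls start order, not readiness, so the backend can boot while Ollama is still loading. Below is a sketch of a readiness gate using a healthcheck; the healthcheck block and the service_healthy condition are my additions, not part of the original file (ollama list only succeeds once the server is answering):

services:
  ollama:
    image: ollama/ollama
    healthcheck:
      test: ["CMD", "ollama", "list"]   # passes only once the API responds
      interval: 10s
      timeout: 5s
      retries: 5

  backend:
    build: ./backend
    depends_on:
      ollama:
        condition: service_healthy      # wait for a passing healthcheck, not just "started"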

Backend (FastAPI + Ollama Integration)

from fastapi import FastAPI
import requests

app = FastAPI()

# "ollama" resolves via Docker's internal DNS (see Error 4 below).
OLLAMA_URL = "http://ollama:11434/api/generate"

@app.post("/ask")
def ask_ai(question: str):
    # stream=False makes Ollama return a single JSON object; the default streams
    # newline-delimited chunks, which would break response.json().
    payload = {"model": "mistral", "prompt": question, "stream": False}
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()
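
The compose file wires in ChromaDB, but the minimal backend above never touches it. Here is a hedged sketch of how retrieval could plug in using the chromadb Python client; the endpoint name, the "docs" collection, and the prompt format are all my own placeholders, not part of the original backend:

import chromadb

# Talk to the chroma service over the compose network.
# Note: 8000 is the container-internal port, not the 8001 host mapping.
chroma_client = chromadb.HttpClient(host="chroma", port=8000)
collection = chroma_client.get_or_create_collection("docs")

@app.post("/ask_with_context")
def ask_with_context(question: str):
    # Fetch the 3 most similar stored documents and prepend them to the prompt.
    hits = collection.query(query_texts=[question], n_results=3)
    context = "\n".join(hits["documents"][0])
    payload = {
        "model": "mistral",
        "prompt": f"Context:\n{context}\n\nQuestion: {question}",
        "stream": False,
    }
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()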

UI (Streamlit)

import streamlit as st
import requests

st.title("GenAI Chatbot")

question = st.text_input("Ask a question:")

if st.button("Ask AI"):
    # "backend" resolves via Docker DNS; the question travels as a query parameter.
    res = requests.post("http://backend:8000/ask", params={"question": question}, timeout=120)
    data = res.json()
    # Ollama puts the generated text in the "response" field; fall back to raw JSON.
    st.write(data.get("response", data))
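
With all four services defined, bringing the whole stack up is one command from the project root:

docker compose up --build

Then open http://localhost:8501 in a browser for the Streamlit UI; the FastAPI interactive docs are at http://localhost:8000/docs.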

Real Errors & DevOps Solutions

Error 1: Docker Permission Denied

permission denied while trying to connect to the Docker daemon socket

Root cause: User was not part of the docker group.

Solution:

sudo usermod -aG docker $USER
newgrp docker
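
Note that newgrp only applies the new group to the current shell; log out and back in for it to take effect everywhere. To confirm the fix:

docker ps   # should list containers without a permission error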

Error 2: Port Already in Use (11434)

failed to bind host port 0.0.0.0:11434: address already in use

Root cause: Ollama was already running on the host via Snap.

Solution:

sudo snap stop ollama
sudo snap disable ollama

Alternatively, change the Docker port mapping, e.g.:

ports:
  - "21434:11434"
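
To find what is holding the port in the first place, sudo ss -ltnp | grep 11434 (or sudo lsof -i :11434) shows the offending process. Also note that only host access changes with the remap: containers still reach Ollama over the internal network on 11434, so OLLAMA_URL stays the same.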

Error 3: Model Not Found in Ollama

"model 'mistral' not found"

Root cause: The Ollama container did not have the model downloaded.

Solution:

docker exec -it genai-docker-project-ollama-1 bash   # name follows <project-dir>-<service>-<index>
ollama pull mistral
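
A hardcoded container name is brittle, so the same pull via the service name is safer:

docker compose exec ollama ollama pull mistral

To keep pulled models across container recreation, you could also mount a volume at /root/.ollama, where Ollama stores its models; the volume name here is my choice:

services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama-models:/root/.ollama

volumes:
  ollama-models: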

Error 4: Container Networking Issues

Problem: Backend could not connect to Ollama.

Root cause: Using localhost instead of the container DNS name.

Fix: Set the URL to the service name:

OLLAMA_URL = "http://ollama:11434/api/generate"
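
Compose puts every service on a shared default network where each container is reachable by its service name; localhost inside the backend container points at the backend itself, not the host. A quick sanity check from inside the backend (assuming Python and requests are in the image, which they are for this backend):

docker compose exec backend python -c "import requests; print(requests.get('http://ollama:11434').text)"

This should print Ollama's banner ("Ollama is running").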

Key Learnings (AI + DevOps)

  • AI systems are distributed systems.
  • Docker is essential for reproducible ML environments.
  • LLM platforms require careful networking and resource management.
  • MLOps bridges DevOps and AI.