Tabby Team Server Setup 2026: Self-Host Code Completion

Published: June 10, 2026 at 03:02 AM EDT

Source: Dev.to

This article was originally published on aifoss.dev.

TL;DR: Tabby v0.32.0 is an Apache 2.0-licensed code completion server: one GPU box on your network, and every developer connects to it. The full team deployment takes under an hour if you have Ubuntu and an NVIDIA GPU ready. The math favors self-hosting once your team hits 8–10 developers.

What you’ll have running after this guide:

- Tabby v0.32.0 running in Docker, exposed over HTTPS via nginx + Let’s Encrypt TLS
- Per-developer API tokens managed through Tabby’s admin panel
- VS Code and JetBrains IDEs connected with inline completions and chat

Honest take: For a 5–15 developer team with a dedicated GPU server, Tabby is the best Copilot replacement available: purpose-built for team use, not retrofitted from a single-user tool. Under 4 developers, the ops overhead isn’t worth it; stick with Copilot.

What you need before starting:

- Ubuntu 22.04 LTS server (physical or VM; cloud VMs with GPU passthrough work too)
- NVIDIA GPU with driver ≥ 535 installed
- Docker Engine and the NVIDIA Container Toolkit
- A domain name with an A record pointing to the server’s public IP
- Ports 80 and 443 open in your firewall/security group

GPU and model pairing by team size (the table below reflects what works in practice as of mid-2026):

GPU                    VRAM    Team size     Recommended model
RTX 3060 / RTX 3070    12 GB   2–4 devs      Qwen/Qwen2.5-Coder-7B
RTX 3090 / 4070 Ti     24 GB   5–10 devs     Qwen/Qwen2.5-Coder-7B + chat model
RTX 4090               24 GB   10–15 devs    Qwen/Qwen2.5-Coder-7B + Qwen2-7B-Instruct
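The break-even claim in the TL;DR can be sanity-checked with simple arithmetic. The sketch below is not from the original article: the function name break_even_seats is hypothetical, and both example figures (a $19 per-seat monthly subscription and a $170 all-in monthly server cost) are illustrative assumptions, so substitute your own quotes.

```shell
# Hypothetical helper: smallest team size at which a flat monthly server
# cost is no more than paying a per-seat subscription for everyone.
# Both arguments are whole dollars per month.
break_even_seats() {
  local server_cost=$1 seat_price=$2
  # Integer ceiling division: ceil(server_cost / seat_price)
  echo $(( (server_cost + seat_price - 1) / seat_price ))
}

# Example with ASSUMED (not measured) figures:
#   break_even_seats 170 19
```

With those assumed inputs the function prints 9. The result moves quickly with the server cost you plug in (power, amortized hardware, admin time), which is why the figures are left as parameters.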

Don’t have a dedicated GPU server yet? RunPod rents dedicated NVIDIA instances on monthly contracts, a reasonable staging ground before committing to hardware. Hardware build options are at runaihome.com if you’re planning a permanent server.

If Docker isn’t installed:

curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker

NVIDIA Container Toolkit (required for GPU passthrough to the container):

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey |
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list |
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' |
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
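The prerequisites call for driver ≥ 535, so it may be worth checking the host driver before testing the container. The following is a minimal sketch, not part of the original guide; the function name driver_at_least is hypothetical, and it compares only the major version number.

```shell
# Hypothetical helper: succeed if a driver version string such as
# "535.104.05" has a major version of at least the given minimum.
driver_at_least() {
  local version=$1 minimum=$2
  local major=${version%%.*}    # keep everything before the first dot
  case $major in
    ''|*[!0-9]*) return 2 ;;    # empty or non-numeric: treat as an error
  esac
  [ "$major" -ge "$minimum" ]
}

# Usage on the server (nvidia-smi must be on PATH):
#   v=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   if driver_at_least "$v" 535; then echo "driver OK: $v"; else echo "driver too old: $v"; fi
```

Keeping the comparison in a pure function means it can be reused in a provisioning script without calling nvidia-smi more than once.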

Verify GPU access:

docker run --rm --gpus all nvidia/cuda:12.3.0-base-ubuntu22.04 nvidia-smi

If the nvidia-smi output shows your GPU, you’re ready.

Create the deployment directory and compose file:

sudo mkdir -p /opt/tabby
sudo tee /opt/tabby/docker-compose.yml > /dev/null << 'EOF'
services:
  tabby:
    image: tabbyml/tabby:0.32.0
    command: serve --model Qwen/Qwen2.5-Coder-7B --device cuda --port 8080
    volumes:
      - tabby_data:/data
    ports:
      - "127.0.0.1:8080:8080"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  tabby_data:
EOF

Pin the image to 0.32.0 rather than latest. Tabby’s model API has changed between minor versions, and a mid-sprint image update that breaks IDE plugins is annoying to debug.

Start the server:

cd /opt/tabby
docker compose up -d
docker compose logs -f tabby
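The pinning advice above can be enforced with a small guard, for example in a pre-deploy script. This is a sketch under assumptions: image_is_pinned is a hypothetical name, and the rule it applies (an explicit tag other than latest, or a digest) is one reasonable reading of "pinned", not something Tabby or Docker defines.

```shell
# Hypothetical guard: succeed only if an image reference carries an
# explicit tag other than "latest" (a digest reference also passes).
image_is_pinned() {
  local ref=$1
  case $ref in
    *@sha256:*) return 0 ;;   # pinned by digest
  esac
  local last=${ref##*/}       # final path component, e.g. "tabby:0.32.0"
  case $last in
    *:latest) return 1 ;;     # explicit but floating tag
    *:?*)     return 0 ;;     # some other explicit tag
    *)        return 1 ;;     # no tag at all, which Docker resolves to latest
  esac
}

# Usage against the compose file from this guide:
#   img=$(awk '$1 == "image:" {print $2}' /opt/tabby/docker-compose.yml)
#   image_is_pinned "$img" || echo "unpinned image: $img" >&2
```

Only the last path component is inspected for a tag, so a registry port such as localhost:5000/tabby is not mistaken for one.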

The first run downloads the model, roughly 4–7 GB for Qwen2.5-Coder-7B. Subsequent starts use the cached volume and take under 30 seconds. Once you see Listening on 0.0.0.0:8080 in the logs, test it:

curl http://localhost:8080/v1/health

Expected: {"status":"ok","model":"Qwen/Qwen2.5-Coder-7B"}
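Because the first start can spend several minutes downloading the model, a scripted deployment may prefer to poll the health endpoint rather than watch the logs. The helper below is a generic retry loop offered as a sketch; wait_until_ready is a hypothetical name, and the attempt count and delay in the usage line are arbitrary assumptions.

```shell
# Hypothetical helper: run a probe command repeatedly until it succeeds.
# Arguments: max attempts, seconds between attempts, then the probe itself.
wait_until_ready() {
  local attempts=$1 delay=$2
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0                  # probe succeeded
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"            # pause before the next attempt
    fi
    i=$((i + 1))
  done
  return 1                      # every attempt failed
}

# Usage (up to 60 tries, 10 seconds apart):
#   wait_until_ready 60 10 curl -fsS http://localhost:8080/v1/health
```

The probe here only checks that curl gets a successful HTTP response (-f fails on 4xx/5xx); it does not inspect the JSON body.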

Install nginx and Certbot:

sudo apt-get install -y nginx certbot python3-certbot-nginx

Create /etc/nginx/sites-available/tabby:

server {
    listen 80;
    server_name tabby.yourdomain.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name tabby.yourdomain.com;

    ssl_certificate     /etc/letsencrypt/live/tabby.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/tabby.yourdomain.com/privkey.pem;
    ssl_protocols       TLSv1.2 TLSv1.3;
    ssl_ciphers         HIGH:!aNULL:!MD5;

    location / {
        proxy_pass         http://127.0.0.1:8080;
        proxy_http_version 1.1;

        # WebSocket required for Tabby's answer engine streaming
        proxy_set_header   Upgrade $http_upgrade;
        proxy_set_header   Connection "upgrade";

        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
    }
}

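Get the certificate and enable the site. The 443 server block references certificate files that do not exist until Certbot has issued them, and nginx -t fails on a missing ssl_certificate file, so request the certificate before linking the site:

sudo certbot certonly --nginx -d tabby.yourdomain.com
sudo ln -s /etc/nginx/sites-available/tabby /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx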

Certbot installs a systemd timer for automatic renewal; confirm it with sudo certbot renew --dry-run.

Common issue: 502 Bad Gateway right after nginx starts almost always means Tabby is still downloading the model. Watch docker compose logs tabby until you see the health check pass.

WebSocket note: Skipping the Upgrade and Connection headers causes Tabby’s chat streaming to hang silently. The IDE plugins don’t depend on WebSocket for basic completions, but the answer engine does.

Open https://tabby.yourdomain.com in a browser. The first visit prompts you to create an admin account; do this before sharing the URL. The first registrant gets admin rights; everyone after that registers as a standard user.

From the admin panel at /admin:

- Users → Invite: Generate invitation links for each team member. Each link is single-use and expires.
- Tokens: Each developer logs in and generates their own token under Settings → Tokens. They copy this once; Tabby doesn’t show full token values again.
- Admin visibility: You see all active tokens, which user owns them, and last-used timestamps. Revoke a token instantly when someone leaves the team.

There’s no built-in token rotation schedule as of v0.32.0. Worth noting in your team runbook: tokens don’t expire unless manually revoked.

In VS Code:

- Install the Tabby extension from the VS Code Marketplace (publisher: TabbyML)
- Open Command Palette (Ctrl+Shift+P) → Tabby: Connect to Server

- Enter your server URL: https://tabby.yourdomain.com
- Paste the token when prompted

Or set it directly in settings.json:

{
  "tabby.api.endpoint": "https://tabby.yourdomain.com",
  "tabby.api.token": "your-token-here"
}

A green Tabby icon in the VS Code status bar confirms a live connection. A gray icon means the connection failed; check the token and that the server URL is reachable.

In JetBrains IDEs:

- File → Settings → Plugins → Marketplace: search Tabby, install, restart the IDE
- File → Settings → Tools → Tabby
- Set Server endpoint: https://tabby.yourdomain.com
- Set Authentication token: paste the developer’s token
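When an IDE plugin reports a failed connection, it can help to rule out the token and URL from a terminal first. The sketch below is not from the original article. It assumes the server accepts the token as an Authorization: Bearer header and that /v1/health is reachable through the proxy; tabby_token_status and TABBY_TOKEN are hypothetical names.

```shell
# Hypothetical helper: print the HTTP status code returned by the health
# endpoint when called with a developer token.
# ASSUMPTION: the token is sent as an "Authorization: Bearer" header.
tabby_token_status() {
  local base_url=$1 token=$2
  # ${base_url%/} drops one trailing slash so the path is not doubled
  curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${token}" \
    "${base_url%/}/v1/health"
}

# Usage:
#   code=$(tabby_token_status https://tabby.yourdomain.com "$TABBY_TOKEN")
#   if [ "$code" = "200" ]; then echo "token accepted"; else echo "got HTTP $code"; fi
```

A 200 suggests both the URL and the token are fine and the problem is on the IDE side; a 401 or 403 typically points at the token, and a 502 at the Tabby container itself.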
