Production Hardening
Reverse proxy, centralized auth, automated backups, and the complete unified stack.
Prerequisites: Completed Parts 1–7, a domain name, basic Linux admin
Time: 45–60 minutes
Recommended VPS: 8GB RAM ($40/mo) for the full production stack
Introduction
You've built a comprehensive AI platform across seven guides. Now it's time to make it production-ready: secure, resilient, monitored, and optimized. This final part transforms your development setup into infrastructure you can rely on.
Nginx Reverse Proxy
Route all services through a single Nginx instance with subdomain routing:
```bash
sudo apt install -y nginx
```

Create `/etc/nginx/sites-available/ai-stack` with one server block per service:

```nginx
# Open WebUI — chat.yourdomain.com
server {
    listen 80;
    server_name chat.yourdomain.com;

    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support for streaming
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 300s;
    }
}

# AnythingLLM — apps.yourdomain.com
server {
    listen 80;
    server_name apps.yourdomain.com;

    location / {
        proxy_pass http://localhost:3001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# Tabby — code.yourdomain.com
server {
    listen 80;
    server_name code.yourdomain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# n8n — auto.yourdomain.com
server {
    listen 80;
    server_name auto.yourdomain.com;

    location / {
        proxy_pass http://localhost:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket for n8n
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

# Qdrant Dashboard — vectors.yourdomain.com
server {
    listen 80;
    server_name vectors.yourdomain.com;

    location / {
        proxy_pass http://localhost:6333;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Enable the site, test the config, and reload:

```bash
sudo ln -s /etc/nginx/sites-available/ai-stack /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```

TLS with Let's Encrypt
```bash
sudo apt install -y certbot python3-certbot-nginx

# Get certificates for all subdomains
sudo certbot --nginx -d chat.yourdomain.com -d apps.yourdomain.com -d code.yourdomain.com -d auto.yourdomain.com -d vectors.yourdomain.com
```

Certbot automatically modifies your Nginx configs and sets up auto-renewal.
Security Headers

Add these inside each server block (or a shared snippet included by every site):

```nginx
# Security headers
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
```

Confirm that certificate auto-renewal works:

```bash
sudo certbot renew --dry-run
```

Centralized Authentication
For single sign-on across all services, consider deploying Authelia or Authentik as an authentication gateway:
- Authelia — Lightweight, YAML-configured, perfect for small teams. See our Authelia guide
- Authentik — Full-featured IdP with admin dashboard. See our Authentik guide
Both support MFA/2FA. Configure each AI service to authenticate through the central gateway for a unified login experience.
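As a sketch of how the gateway pattern wires into Nginx: both tools hook in via `auth_request`, which sends a subrequest to the gateway before proxying. The verification endpoint and port below are illustrative assumptions (they vary by tool and version), so check your gateway's own Nginx integration docs for exact values:

```nginx
# Illustrative only — endpoint path and port 9091 are assumptions.
server {
    server_name chat.yourdomain.com;

    location / {
        auth_request /internal/authz;       # ask the auth gateway before proxying
        proxy_pass http://localhost:3000;
    }

    location = /internal/authz {
        internal;
        proxy_pass http://localhost:9091/api/authz/auth-request;
        proxy_set_header X-Original-URL $scheme://$http_host$request_uri;
    }
}
```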
Firewall Configuration

```bash
# Reset UFW
sudo ufw reset

# Default policies
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH
sudo ufw allow 22/tcp

# Allow HTTP/HTTPS only (Nginx handles routing)
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

# Enable firewall
sudo ufw enable
sudo ufw status verbose
```

All application ports (3000, 3001, 5678, 6333, 8080, 11434) are now unreachable from the internet; only Nginx, listening on 80 and 443, reaches them over localhost. Note that Docker's published ports can bypass UFW rules, which is why the Compose file later in this guide also binds every service port to 127.0.0.1.
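To confirm nothing slipped through, you can list listening sockets and flag any app port bound to a public interface (a quick sketch; the port list matches this stack):

```bash
# Flag app ports not bound to loopback; prints OK when everything is local-only.
ss -tln | grep -E ':(3000|3001|5678|6333|8080|11434)\b' | grep -v '127\.0\.0\.1' \
  && echo "WARNING: public bind found" \
  || echo "OK: app ports are loopback-only"
```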
Fail2ban for Brute-Force Protection

```bash
sudo apt install -y fail2ban

# Configure Nginx jail
sudo tee /etc/fail2ban/jail.local << 'EOF'
[nginx-http-auth]
enabled = true
port = http,https
filter = nginx-http-auth
logpath = /var/log/nginx/error.log
maxretry = 5
bantime = 3600
EOF

sudo systemctl restart fail2ban
```

Automated Backups
Backup strategy for each component:
| Component | Data | Frequency | Size |
|---|---|---|---|
| Ollama models | ~/.ollama/models | Weekly (rarely change) | 2–10 GB |
| Qdrant vectors | Docker volume | Daily (incremental) | Varies |
| Open WebUI | Docker volume | Daily | Small |
| n8n workflows | Docker volume | Daily | Small |
| AnythingLLM | Docker volume | Daily | Small |
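The backup script below archives each Docker volume with `tar czf … -C /source .`, and a restore is simply the reverse (`tar xzf` into the volume). Here is a self-contained round-trip check of that archive format, using throwaway directories in place of real volumes:

```bash
#!/bin/sh
# Round-trip sanity check of the backup archive format (scratch dirs, no Docker).
set -e
src=$(mktemp -d)   # stands in for a Docker volume's contents
bk=$(mktemp -d)    # stands in for the backup directory
echo "hello" > "$src/file.txt"
tar czf "$bk/vol.tar.gz" -C "$src" .       # same flags as the backup script
tar tzf "$bk/vol.tar.gz" | grep -q 'file.txt' && echo "archive OK"
# Restoring into a real volume would look like (illustrative):
#   docker run --rm -v qdrant-data:/target -v /backups/ai-stack/DATE:/backup \
#     alpine tar xzf /backup/qdrant-data.tar.gz -C /target
rm -rf "$src" "$bk"
```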
Save the script as `/usr/local/bin/backup-ai-stack.sh`:

```bash
#!/bin/bash
# Automated backup script for the AI stack
BKDIR="/backups/ai-stack/$(date +%Y-%m-%d)"
mkdir -p "$BKDIR"

echo "=== AI Stack Backup: $(date) ==="

# Back up Docker volumes
for vol in open-webui-data qdrant-data anythingllm-data n8n-data tabby-data; do
    echo "Backing up $vol..."
    docker run --rm \
        -v "$vol":/source:ro \
        -v "$BKDIR":/backup \
        alpine tar czf "/backup/$vol.tar.gz" -C /source .
done

# Back up Ollama models (Mondays only)
if [ "$(date +%u)" = "1" ]; then
    echo "Weekly: Backing up Ollama models..."
    tar czf "$BKDIR/ollama-models.tar.gz" -C "/home/$USER" .ollama/models
fi

# Back up Nginx configs
cp -r /etc/nginx/sites-available "$BKDIR/nginx-configs"

# Retention: keep 30 days
find /backups/ai-stack -maxdepth 1 -mtime +30 -exec rm -rf {} +

echo "=== Backup complete: $BKDIR ==="
```

Make it executable and schedule it:

```bash
chmod +x /usr/local/bin/backup-ai-stack.sh

# Add to crontab (daily at 3 AM)
echo "0 3 * * * /usr/local/bin/backup-ai-stack.sh >> /var/log/ai-stack-backup.log 2>&1" | sudo tee -a /etc/crontab
```

Resource Tuning & Monitoring
Memory Allocation Strategy (8GB VPS)
| Service | RAM Allocation | Notes |
|---|---|---|
| Ollama + Model | 3–4 GB | Largest consumer; varies by model |
| Open WebUI | ~500 MB | Lightweight |
| Qdrant | ~500 MB | Depends on vector count |
| AnythingLLM | ~300 MB | Lightweight |
| Tabby | 1–2 GB | Depends on code model |
| n8n | ~300 MB | Grows with active workflows |
| Nginx + OS | ~500 MB | Overhead |
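To compare this plan against reality, standard tools report both host-level and per-container usage (a quick sketch; `docker stats` needs the Docker daemon running):

```bash
# Check actual usage against the allocation plan above
free -h                                        # overall memory headroom on the host
docker stats --no-stream 2>/dev/null || true   # per-container usage, if Docker is up
```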
Docker Resource Limits
Add memory limits to prevent any single service from consuming all RAM:
```yaml
deploy:
  resources:
    limits:
      memory: 512M
    reservations:
      memory: 256M
```

Quick Monitoring with Uptime Kuma
For lightweight monitoring, add Uptime Kuma to check service health:
```yaml
uptime-kuma:
  image: louislam/uptime-kuma:latest
  container_name: uptime-kuma
  restart: unless-stopped
  ports:
    - "127.0.0.1:3002:3001"
  volumes:
    - uptime-kuma-data:/app/data
```

Like the rest of the stack, the dashboard binds to loopback; reach it through an Nginx subdomain or an SSH tunnel. Configure health checks for each service endpoint (Ollama :11434, Open WebUI :3000, Qdrant :6333, etc.).
The Complete Stack
Here's the unified Docker Compose file combining all services from Parts 1–7 with production settings:
```yaml
version: "3.8"

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "127.0.0.1:3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
      - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
      - ENABLE_SIGNUP=false
    volumes:
      - open-webui-data:/app/backend/data
    extra_hosts:
      - "host.docker.internal:host-gateway"
    deploy:
      resources:
        limits:
          memory: 512M

  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    restart: unless-stopped
    ports:
      - "127.0.0.1:6333:6333"
    environment:
      - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY}
    volumes:
      - qdrant-data:/qdrant/storage
    deploy:
      resources:
        limits:
          memory: 512M

  anythingllm:
    image: mintplexlabs/anythingllm:latest
    container_name: anythingllm
    restart: unless-stopped
    ports:
      - "127.0.0.1:3001:3001"
    environment:
      - LLM_PROVIDER=ollama
      - OLLAMA_BASE_PATH=http://host.docker.internal:11434
      - EMBEDDING_ENGINE=ollama
      - EMBEDDING_MODEL_PREF=nomic-embed-text
      - VECTOR_DB=qdrant
      - QDRANT_ENDPOINT=http://qdrant:6333
      - QDRANT_API_KEY=${QDRANT_API_KEY}
    volumes:
      - anythingllm-data:/app/server/storage
    extra_hosts:
      - "host.docker.internal:host-gateway"
    deploy:
      resources:
        limits:
          memory: 512M

  tabby:
    image: tabbyml/tabby:latest
    container_name: tabby
    restart: unless-stopped
    command: serve --model StarCoder-1B --device cpu
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      - tabby-data:/data
    environment:
      - TABBY_DISABLE_USAGE_COLLECTION=1
    deploy:
      resources:
        limits:
          memory: 2G

  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    restart: unless-stopped
    ports:
      - "127.0.0.1:5678:5678"
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}
      - WEBHOOK_URL=https://auto.yourdomain.com/
      - GENERIC_TIMEZONE=UTC
    volumes:
      - n8n-data:/home/node/.n8n
    extra_hosts:
      - "host.docker.internal:host-gateway"
    deploy:
      resources:
        limits:
          memory: 512M

volumes:
  open-webui-data:
  qdrant-data:
  anythingllm-data:
  tabby-data:
  n8n-data:
```

And a matching `.env` file in the same directory:

```bash
WEBUI_SECRET_KEY=your-generated-secret-key
QDRANT_API_KEY=your-generated-api-key
N8N_USER=admin
N8N_PASSWORD=your-secure-password
```

Notice that every port binds to 127.0.0.1: the services are reachable only through the Nginx reverse proxy, never directly from the internet.
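One simple way to fill in the placeholder secrets is `openssl rand` (the variable names match the `.env` file above):

```bash
# Generate strong random values for the .env secrets
WEBUI_SECRET_KEY=$(openssl rand -hex 32)
QDRANT_API_KEY=$(openssl rand -hex 32)
echo "WEBUI_SECRET_KEY=$WEBUI_SECRET_KEY"
echo "QDRANT_API_KEY=$QDRANT_API_KEY"
```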
Maintenance Playbook
Monthly Checklist
- Update containers: `docker compose pull && docker compose up -d`
- Patch the OS: `sudo apt update && sudo apt upgrade`
- Verify certificate renewal: `sudo certbot renew --dry-run`
- Check disk space: `df -h`
- Review banned IPs: `sudo fail2ban-client status`

🎉 Series Complete!
You've built a complete, production-ready, private AI platform:
| Part | Service | Replaces |
|---|---|---|
| 1 | Ollama | OpenAI API ($50–200/mo) |
| 2 | Open WebUI | ChatGPT Team ($125/mo) |
| 3 | Qdrant RAG | Pinecone ($70+/mo) |
| 4 | AnythingLLM | Custom AI apps ($) |
| 5 | Tabby | GitHub Copilot ($95/mo) |
| 6 | CrewAI | AI agent platforms ($) |
| 7 | n8n + Ollama | Zapier AI ($50+/mo) |
| Total | 8GB RamNode VPS | $40/mo vs $390–690+/mo |
Zero data exposure. Unlimited usage. Complete sovereignty over your AI infrastructure.
