Deploy Kokoro TTS on a RamNode VPS

An OpenAI-compatible speech endpoint backed by the 82M-parameter Kokoro model: CPU-only inference, persistent across reboots, and served behind nginx with TLS and API-key authentication.

At a Glance

Project	Kokoro-82M via Kokoro-FastAPI
License	Apache 2.0
Recommended Plan	RamNode Premium NVMe 4 vCPU / 8 GB
OS	Ubuntu 24.04 LTS
Estimated Setup Time	30 minutes

Provision + Hardening

Create user + base

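# Run as root on the fresh VPS: create a sudo-capable user and copy root's SSH keys to it
adduser kokoro && usermod -aG sudo kokoro
rsync --archive --chown=kokoro:kokoro ~/.ssh /home/kokoro

# Continue as kokoro. The reload only has an effect after changes to /etc/ssh/sshd_config
sudo systemctl reload ssh
sudo apt update && sudo apt upgrade -y
sudo apt install -y ufw fail2ban curl ca-certificates gnupg unattended-upgrades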

Firewall

sudo ufw default deny incoming && sudo ufw default allow outgoing
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp && sudo ufw allow 443/tcp
sudo ufw enable
sudo systemctl enable --now fail2ban

Install Docker

Add repo + install

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER && newgrp docker

Deploy Kokoro-FastAPI

Project layout

mkdir -p ~/kokoro && cd ~/kokoro

docker-compose.yml

services:
  kokoro:
    image: ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4
    container_name: kokoro-tts
    restart: unless-stopped
    ports:
      - "127.0.0.1:8880:8880"
    environment:
      - ONNX_NUM_THREADS=4
      - ONNX_INTER_OP_THREADS=2
      - PYTORCH_NUM_THREADS=4
    volumes:
      - kokoro-models:/app/api/src/models
      - kokoro-voices:/app/api/src/voices
      - kokoro-temp:/tmp/kokoro
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8880/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      resources:
        limits:
          memory: 4G

volumes:
  kokoro-models:
  kokoro-voices:
  kokoro-temp:

Pull and start

docker compose pull
docker compose up -d
docker compose logs -f
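
The container can take a while to answer on first start (the compose healthcheck allows a 60-second start period), so a readiness check is handy before sending requests. The following is a minimal sketch, assuming the same /health endpoint the compose healthcheck uses. The helper name wait_for_kokoro and its retry defaults are illustrative and not part of Kokoro-FastAPI.

```shell
# Poll the health endpoint until it answers or the retry budget runs out.
# wait_for_kokoro is a hypothetical helper; URL, attempts, and delay are assumptions.
wait_for_kokoro() {
  url="${1:-http://127.0.0.1:8880/health}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output
    if curl -fs -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage: wait_for_kokoro && echo "ready for requests"
```

With the defaults this waits up to about a minute, which roughly matches the start period in the compose file.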

Verify the Deployment

Health + sample synthesis

curl http://127.0.0.1:8880/health

curl -X POST http://127.0.0.1:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","input":"RamNode hosting, now with synthetic speech.","voice":"af_bella","response_format":"mp3"}' \
  --output test.mp3
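
Because curl saves whatever the server returns, a failed request leaves a JSON error body in test.mp3 rather than audio. A rough sanity check, assuming the usual MP3 layout (an ID3 tag or an MPEG frame sync at the start of the file), could look like this. The helper name looks_like_mp3 is hypothetical, and the check inspects only the first three bytes.

```shell
# Return 0 if the file starts like an MP3: an "ID3" tag, or a frame sync
# (0xFF followed by a byte whose top three bits are set). Sketch only.
looks_like_mp3() {
  # First three bytes as six lowercase hex characters, e.g. 494433 for "ID3"
  magic=$(od -An -tx1 -N3 "$1" | tr -d ' \n')
  case "$magic" in
    494433) return 0 ;;      # "ID3" tag header
    ffe*|fff*) return 0 ;;   # MPEG audio frame sync
    *) return 1 ;;
  esac
}

# Usage: looks_like_mp3 test.mp3 && echo "audio received" || cat test.mp3
```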

Nginx Reverse Proxy + TLS + API Key Auth

Install nginx + certbot

sudo apt install -y nginx certbot python3-certbot-nginx

/etc/nginx/sites-available/kokoro

upstream kokoro_backend {
  server 127.0.0.1:8880;
  keepalive 32;
}
limit_req_zone $binary_remote_addr zone=kokoro_api:10m rate=10r/s;
limit_conn_zone $binary_remote_addr zone=kokoro_conn:10m;

server {
  listen 80;
  server_name tts.example.com;
  location / { return 301 https://$host$request_uri; }
}
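server {
  listen 443 ssl http2;
  server_name tts.example.com;
  # nginx -t rejects a "listen ... ssl" server that has no certificate, so issue one before
  # enabling this site, for example: sudo certbot certonly --nginx -d tts.example.com
  # The paths below assume certbot's default /etc/letsencrypt/live layout.
  ssl_certificate     /etc/letsencrypt/live/tts.example.com/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/tts.example.com/privkey.pem;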
  server_tokens off;
  add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
  client_max_body_size 4M;
  proxy_read_timeout 300s;
  proxy_buffering off;
  limit_req zone=kokoro_api burst=20 nodelay;
  limit_conn kokoro_conn 5;

  if ($valid_kokoro_key = 0) { return 401; }

  location / {
    proxy_pass http://kokoro_backend;
    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_set_header Connection "";
  }
}

/etc/nginx/conf.d/kokoro-keys.conf

map $http_authorization $valid_kokoro_key {
  default                                  0;
  "Bearer REPLACE_WITH_LONG_RANDOM_STRING" 1;
  "Bearer SECOND_KEY_FOR_ANOTHER_CLIENT"   1;
}
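
The two placeholder strings in the map need to be replaced with long random values. One way to produce them, sketched here with standard tools only, is to read 32 bytes from /dev/urandom and print them as hex inside a ready-to-paste map entry. The helper name new_kokoro_key_line is hypothetical.

```shell
# Print one map entry with a fresh 256-bit key, ready to paste into kokoro-keys.conf.
# new_kokoro_key_line is a hypothetical helper name.
new_kokoro_key_line() {
  # 32 random bytes rendered as 64 lowercase hex characters
  key=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
  printf '  "Bearer %s" 1;\n' "$key"
}

# Usage: new_kokoro_key_line
```

Clients then send the hex string as Authorization: Bearer followed by the key. After editing the map, run sudo nginx -t && sudo systemctl reload nginx so the new key takes effect.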

Enable + cert

sudo ln -s /etc/nginx/sites-available/kokoro /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
sudo certbot --nginx -d tts.example.com --redirect --agree-tos -m you@example.com -n
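
With the site enabled, it is worth confirming that the key check behaves as intended: a request without a key should be rejected, and a request with a listed key should reach the backend. The sketch below assumes the tts.example.com hostname from the config. The helper name http_status and the KOKORO_API_KEY variable are hypothetical.

```shell
# Print only the HTTP status code for a GET request, optionally sending a bearer key.
# http_status is a hypothetical helper; the URL and key are placeholders.
http_status() {
  url="$1"
  key="${2:-}"
  if [ -n "$key" ]; then
    curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $key" "$url"
  else
    curl -s -o /dev/null -w '%{http_code}' "$url"
  fi
}

# Usage (KOKORO_API_KEY is assumed to hold one of the keys from kokoro-keys.conf):
#   http_status https://tts.example.com/health                     # expect 401
#   http_status https://tts.example.com/health "$KOKORO_API_KEY"   # expect 200
```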

Restrict the Web UI / Docs

Allowlist /web /docs /debug

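# These blocks go inside the 443 server block. The server-level key check runs before
# location matching, so these paths still require a valid Authorization header.
location /web {
  allow 203.0.113.42;
  deny all;
  proxy_pass http://kokoro_backend;
  proxy_set_header Host $host;
}
location /docs  { allow 203.0.113.42; deny all; proxy_pass http://kokoro_backend; proxy_set_header Host $host; }
location /debug { allow 203.0.113.42; deny all; proxy_pass http://kokoro_backend; proxy_set_header Host $host; }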

CPU Tuning

Set ONNX_NUM_THREADS to the physical core count
Set ONNX_INTER_OP_THREADS to half of that value
For long-form synthesis, route requests to a dedicated container with its own queue so they do not hold up short requests
Watch CPU and memory usage with docker stats kokoro-tts
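
The first two rules are simple arithmetic on the core count. A small sketch that prints both settings for a given number of cores follows. The helper name kokoro_thread_env is hypothetical, and the floor of 1 for the inter-op value is an assumption added so a single-core host does not get 0.

```shell
# Derive the two ONNX thread settings from a core count, following the rule above:
# intra-op threads = cores, inter-op threads = half of that (never below 1).
# kokoro_thread_env is a hypothetical helper name.
kokoro_thread_env() {
  cores="$1"
  inter=$((cores / 2))
  [ "$inter" -ge 1 ] || inter=1
  echo "ONNX_NUM_THREADS=$cores"
  echo "ONNX_INTER_OP_THREADS=$inter"
}

# Usage: kokoro_thread_env "$(nproc)"
# nproc counts logical CPUs, which on a VPS may be more than the physical cores,
# so treat the result as a starting point and confirm with docker stats.
```

For the 4 vCPU plan this yields 4 and 2, the values already used in docker-compose.yml.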

Voices

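Naming: [lang][gender]_name, where the first letter is the language or accent (a for American English, b for British English, j for Japanese, z for Mandarin Chinese) and the second is f or m for female or male (e.g. af_bella, bm_george, jf_*, zf_*). Blend voices with af_bella+af_sky, or weight the blend as af_bella(2)+af_sky(1).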
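
A blend is passed in the same voice field as a single voice name. The sketch below builds the request body for /v1/audio/speech with any voice string, reusing the fields from the verification request earlier in this guide. The helper name speech_payload is hypothetical. It escapes only backslashes and double quotes, so text containing newlines or control characters is outside its scope.

```shell
# Build the JSON body for /v1/audio/speech with a chosen voice or blend.
# speech_payload is a hypothetical helper; it handles only backslashes and
# double quotes in the input text.
speech_payload() {
  text=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  voice="$2"
  printf '{"model":"kokoro","input":"%s","voice":"%s","response_format":"mp3"}\n' "$text" "$voice"
}

# Usage (weighted 2:1 blend of two voices):
#   curl -X POST http://127.0.0.1:8880/v1/audio/speech \
#     -H "Content-Type: application/json" \
#     -d "$(speech_payload 'Blended voice test.' 'af_bella(2)+af_sky(1)')" \
#     --output blend.mp3
```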
