Text-to-Speech
Apache 2.0
Deploy Kokoro TTS on a RamNode VPS
An OpenAI-compatible speech endpoint backed by the 82M-parameter Kokoro model — CPU-only inference, persistent across reboots, behind nginx + TLS with API key auth.
At a Glance
| Project | Kokoro-82M via Kokoro-FastAPI |
| License | Apache 2.0 |
| Recommended Plan | RamNode Premium NVMe 4 vCPU / 8 GB |
| OS | Ubuntu 24.04 LTS |
| Estimated Setup Time | 30 minutes |
1
Provision + Hardening
Create user + base
adduser kokoro && usermod -aG sudo kokoro
rsync --archive --chown=kokoro:kokoro ~/.ssh /home/kokoro
sudo systemctl reload ssh
sudo apt update && sudo apt upgrade -y
sudo apt install -y ufw fail2ban curl ca-certificates gnupg unattended-upgrades
Firewall
sudo ufw default deny incoming && sudo ufw default allow outgoing
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp && sudo ufw allow 443/tcp
sudo ufw enable
sudo systemctl enable --now fail2ban
2
Install Docker
Add repo + install
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER && newgrp docker
3
Deploy Kokoro-FastAPI
Project layout
mkdir -p ~/kokoro && cd ~/kokoro
docker-compose.yml
services:
  kokoro:
    image: ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4
    container_name: kokoro-tts
    restart: unless-stopped
    ports:
      - "127.0.0.1:8880:8880"
    environment:
      - ONNX_NUM_THREADS=4
      - ONNX_INTER_OP_THREADS=2
      - PYTORCH_NUM_THREADS=4
    volumes:
      - kokoro-models:/app/api/src/models
      - kokoro-voices:/app/api/src/voices
      - kokoro-temp:/tmp/kokoro
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8880/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      resources:
        limits:
          memory: 4G
volumes:
  kokoro-models:
  kokoro-voices:
  kokoro-temp:
Pull and start
docker compose pull
docker compose up -d
docker compose logs -f
4
Verify the Deployment
Health + sample synthesis
curl http://127.0.0.1:8880/health
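Right after `docker compose up -d`, the health endpoint may refuse connections while the model loads (the compose healthcheck allows a 60s start period). As a rough sketch, a small retry helper can poll until the service answers. The name `wait_until` is hypothetical and not part of Kokoro-FastAPI, and the retry counts are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kokoro-FastAPI): retry a command until it succeeds.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Success: stop polling and report it to the caller.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$(( i + 1 ))
  done
  # Every attempt failed.
  return 1
}
```

For example, `wait_until 12 5 curl -fsS http://127.0.0.1:8880/health && echo ready` polls every 5 seconds for up to about a minute before giving up.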
curl -X POST http://127.0.0.1:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","input":"RamNode hosting, now with synthetic speech.","voice":"af_bella","response_format":"mp3"}' \
  --output test.mp3
5
Nginx Reverse Proxy + TLS + API Key Auth
Install nginx + certbot
sudo apt install -y nginx certbot python3-certbot-nginx
/etc/nginx/sites-available/kokoro
upstream kokoro_backend {
    server 127.0.0.1:8880;
    keepalive 32;
}
limit_req_zone $binary_remote_addr zone=kokoro_api:10m rate=10r/s;
limit_conn_zone $binary_remote_addr zone=kokoro_conn:10m;
server {
    listen 80;
    server_name tts.example.com;
    location / { return 301 https://$host$request_uri; }
}
server {
    listen 443 ssl http2;
    server_name tts.example.com;
    server_tokens off;
    # Standard Let's Encrypt paths; the files exist once certbot has issued the certificate.
    ssl_certificate /etc/letsencrypt/live/tts.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/tts.example.com/privkey.pem;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    client_max_body_size 4M;
    proxy_read_timeout 300s;
    proxy_buffering off;
    limit_req zone=kokoro_api burst=20 nodelay;
    limit_conn kokoro_conn 5;
    # Reject any request whose Authorization header is not a known key (see kokoro-keys.conf).
    if ($valid_kokoro_key = 0) { return 401; }
    location / {
        proxy_pass http://kokoro_backend;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header Connection "";
    }
}
/etc/nginx/conf.d/kokoro-keys.conf
map $http_authorization $valid_kokoro_key {
    default 0;
    "Bearer REPLACE_WITH_LONG_RANDOM_STRING" 1;
    "Bearer SECOND_KEY_FOR_ANOTHER_CLIENT" 1;
}
Enable + cert
# Issue the certificate before enabling the site: nginx -t fails while a "listen 443 ssl" block has no certificate files.
sudo certbot certonly --nginx -d tts.example.com --agree-tos -m you@example.com -n --deploy-hook "systemctl reload nginx"
sudo ln -s /etc/nginx/sites-available/kokoro /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
6
Restrict the Web UI / Docs
Allowlist /web /docs /debug
# Place these inside the 443 server block in /etc/nginx/sites-available/kokoro and replace 203.0.113.42 with your own IP.
location /web {
    allow 203.0.113.42;
    deny all;
    proxy_pass http://kokoro_backend;
    proxy_set_header Host $host;
}
location /docs { allow 203.0.113.42; deny all; proxy_pass http://kokoro_backend; }
location /debug { allow 203.0.113.42; deny all; proxy_pass http://kokoro_backend; }
7
CPU Tuning
- Set ONNX_NUM_THREADS = physical core count
- Set ONNX_INTER_OP_THREADS = half of that
- For long-form, route to a dedicated container with its own queue
- Watch with docker stats kokoro-tts
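The two thread rules above reduce to a few lines of arithmetic. The sketch below is only an illustration: `kokoro_thread_env` is a hypothetical helper name, and the core count is passed in by hand because on a VPS the reported vCPU count does not always match physical cores.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the two ONNX thread settings from a core count.
# Usage: kokoro_thread_env <physical_core_count>
kokoro_thread_env() {
  local cores="$1" inter
  # Accept only positive integers without leading zeros.
  case "$cores" in
    ''|*[!0-9]*|0*)
      echo "usage: kokoro_thread_env <physical_core_count>" >&2
      return 1
      ;;
  esac
  # Inter-op threads are half the core count (integer division).
  inter=$(( cores / 2 ))
  # Never drop below one inter-op thread, e.g. on a single-core plan.
  if [ "$inter" -lt 1 ]; then
    inter=1
  fi
  printf 'ONNX_NUM_THREADS=%s\nONNX_INTER_OP_THREADS=%s\n' "$cores" "$inter"
}
```

With `kokoro_thread_env 4`, the output is `ONNX_NUM_THREADS=4` and `ONNX_INTER_OP_THREADS=2`, which matches the values in the compose file for the 4 vCPU plan.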
Voices
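Naming: [lang][gender]_name, e.g. af_bella (American English, female), bm_george (British English, male), jf_* (Japanese, female), zf_* (Mandarin Chinese, female). Blend voices by joining names with +, as in af_bella+af_sky, or add weights in parentheses, as in af_bella(2)+af_sky(1).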
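The blend syntax is plain string assembly, so it can be generated instead of typed by hand. The helper below is a minimal sketch: `blend_voices` and its `name:weight` argument format are assumptions made for this example, not part of Kokoro-FastAPI, and it does not check that the voices exist.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Kokoro blend string from name[:weight] arguments.
# Usage: blend_voices af_bella:2 af_sky:1   (weights are optional)
blend_voices() {
  local out="" spec name part
  for spec in "$@"; do
    # Everything before the first colon is the voice name.
    name="${spec%%:*}"
    if [ "$name" = "$spec" ]; then
      # No colon present: use the bare voice name.
      part="$name"
    else
      # Append the weight in parentheses, e.g. af_bella(2).
      part="${name}(${spec#*:})"
    fi
    # Join the parts with "+", skipping the separator before the first one.
    out="${out:+$out+}$part"
  done
  printf '%s\n' "$out"
}
```

The resulting string goes in the `voice` field of the request body, in place of a single name such as `af_bella`.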
