OLake is an open-source, Go-based extract-and-load platform that replicates operational databases (PostgreSQL, MySQL, MongoDB, Oracle, MSSQL, DB2) and other sources into open lakehouse formats such as Apache Iceberg and Parquet. It runs full loads and change data capture without Spark, Flink, Kafka, or Debezium, and ships a self-serve web UI for configuring sources, destinations, and jobs. This guide deploys the full OLake UI stack on a single RamNode VPS using Docker Compose, then locks it behind Caddy with TLS so the UI is never directly exposed.
OLake is licensed under Apache 2.0.
What you are deploying
The OLake UI is not a single container. The published compose stack brings up several services that work together:
| Service | Role |
|---|---|
| OLake UI | Web interface for sources, destinations, and jobs |
| Temporal worker | Runs the actual replication jobs |
| Temporal server | Workflow orchestration engine |
| Temporal UI | Workflow monitoring and debugging |
| PostgreSQL | Stores job configs and sync state |
| Elasticsearch | Backing store for Temporal workflow data |
| Signup init | One-time job that creates the default admin user |
The UI is exposed on port 8000. The replication jobs themselves spin up additional short-lived connector containers, which is why this stack needs the Docker socket and a bit more memory than a typical web app.
Prerequisites
Because the stack includes Elasticsearch and Temporal alongside Postgres, give it room. A RamNode KVM VPS with at least 4 GB RAM and 2 vCPU is a sensible floor, and 8 GB is more comfortable if you run several concurrent jobs or large tables. Provision generous disk for the persistence directory and any local Parquet output.
This guide assumes Ubuntu 24.04 LTS, a non-root sudo user, and a DNS A record (for example olake.example.com) pointing at the VPS before you begin so Caddy can issue a certificate.
Your destination (Iceberg on S3, MinIO, Glue, a REST catalog such as Lakekeeper or Nessie, or local Parquet) is configured later inside the UI and is not part of this server build. If you write to remote object storage, those credentials are entered in the UI and encrypted at rest.
1. Initial server preparation and hardening
sudo adduser deploy
sudo usermod -aG sudo deploy
sudo apt update && sudo apt -y upgrade
Harden SSH in /etc/ssh/sshd_config with PermitRootLogin no and PasswordAuthentication no, then run sudo systemctl restart ssh once your key is in place.
Configure the firewall so only SSH and the web ports are reachable. Port 8000, the Postgres port, Elasticsearch, and the Temporal ports all stay closed to the internet.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
Enable automatic security updates:
sudo apt -y install unattended-upgrades
sudo dpkg-reconfigure --priority=low unattended-upgrades
Elasticsearch needs an elevated vm.max_map_count. Set it permanently:
echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-olake.conf
sudo sysctl --system
2. Install Docker
sudo apt -y install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
| sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" \
| sudo tee /etc/apt/sources.list.d/docker.list
sudo apt update
sudo apt -y install docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo usermod -aG docker deploy
Log out and back in so the group change takes effect.
3. Fetch and configure the OLake stack
OLake distributes a versioned compose file. Pull it into a working directory so you can edit the defaults before first launch rather than running the one-line quickstart, which would start with insecure defaults.
mkdir -p ~/olake && cd ~/olake
curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml -o docker-compose.yml
Before starting anything, edit three blocks at the top of the file.
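The admin password and the encryption passphrase in the blocks that follow should both be long random strings. One way to produce them, as a minimal sketch (gen_secret is a helper name made up for this guide, not part of OLake):

```shell
# Generate a random secret: 32 bytes from the kernel's random source,
# base64-encoded onto a single line (44 characters).
gen_secret() {
  head -c 32 /dev/urandom | base64 | tr -d '\n'
}

# Print one value for each setting, then paste them into docker-compose.yml.
printf 'admin password: %s\n' "$(gen_secret)"
printf 'encryption key: %s\n' "$(gen_secret)"
```

Base64 output uses only letters, digits, +, / and =, so the values can be pasted inside the double-quoted YAML strings without escaping.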
Change the default admin credentials. The stack creates this user on first startup, so set it now:
x-signup-defaults:
  username: &defaultUsername "your-admin-username"
  password: &defaultPassword "a-long-random-password"
  email: &defaultEmail "admin@example.com"
Set an explicit persistence path so you know exactly which directory to back up. By default data lands in ${PWD}/olake-data:
x-app-defaults:
  host_persistence_path: &hostPersistencePath /var/lib/olake
Create that directory and make it writable:
sudo install -d -o $USER -g $USER /var/lib/olake
Set an encryption key so source and destination credentials are not stored in plaintext in the metadata database. For a single VPS a passphrase is fine (OLake hashes it with SHA-256); for stricter setups point this at a KMS key ARN:
x-encryption:
  key: &encryptionKey "a-strong-passphrase-here"
4. Bind the UI to localhost only
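The default compose file publishes the UI on 0.0.0.0:8000, which would expose it on the public IP. The ufw rules from section 1 do not close this gap on their own, because Docker writes its own iptables rules for published ports and those are evaluated ahead of ufw's. Pin the port to loopback so only the reverse proxy can reach it. Find the OLake UI service ports entry and change it from 8000:8000 to: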
ports:
  - "127.0.0.1:8000:8000"
Do the same for the Temporal UI service if it publishes a host port. Nothing in this stack needs a publicly bound port once Caddy is in front.
5. Start the stack
cd ~/olake
docker compose up -d
Give Elasticsearch and Temporal a minute to become healthy, then check:
docker compose ps
All services should report healthy or running. The signup-init container runs once and exits, which is expected. The UI is now answering on 127.0.0.1:8000.
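To confirm the loopback binding from section 4 actually took effect, it helps to list what is listening on wildcard addresses. The sketch below is an illustration, not an OLake tool: public_listeners is a hypothetical helper that reads ss -tlnH output on stdin and prints any TCP listener bound to all interfaces on a port other than 22, 80, or 443.

```shell
# Print listeners bound to 0.0.0.0, * or [::] on ports other than 22/80/443.
# Expects `ss -tlnH` output on stdin (column 4 is the local address:port).
public_listeners() {
  awk '
    {
      addr = $4
      port = addr
      sub(/.*:/, "", port)   # keep only what follows the last colon
      host = substr(addr, 1, length(addr) - length(port) - 1)
      if ((host == "0.0.0.0" || host == "*" || host == "[::]") &&
          port != "22" && port != "80" && port != "443")
        print addr
    }'
}
```

Run it as ss -tlnH | public_listeners. No output means nothing beyond SSH and the web ports is bound publicly; any line printed points at a service whose ports entry still needs the 127.0.0.1: prefix.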
6. Reverse proxy and TLS with Caddy
Install Caddy:
sudo apt -y install debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' \
| sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' \
| sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update && sudo apt -y install caddy
OLake's UI login is your first line of defense, but you should add a second one at the proxy because the stack also exposes a Temporal UI with no auth of its own. Generate a bcrypt hash for an extra basic-auth gate:
caddy hash-password --plaintext 'your-proxy-password'
/etc/caddy/Caddyfile:
olake.example.com {
    encode gzip
    basic_auth {
        opsuser PASTE_THE_BCRYPT_HASH_HERE
    }
    reverse_proxy 127.0.0.1:8000
}
sudo systemctl reload caddy
For an internal-only tool like this, consider also restricting port 443 to known IP addresses, either in the Caddyfile with a remote-IP matcher or at the firewall. OLake holds credentials to your production databases, so treat access to it as sensitive.
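If you take the firewall route, the rules might look like the sketch below. It assumes a single trusted address, 203.0.113.10 (a documentation placeholder), and https_allowlist_rules is a hypothetical helper that only prints ufw commands so you can review them before running anything.

```shell
# Print the ufw commands that limit port 443 to the given source addresses.
https_allowlist_rules() {
  for ip in "$@"; do
    printf 'ufw allow from %s to any port 443 proto tcp\n' "$ip"
  done
  # Drop the open-to-everyone rule from section 1 last, so the allowed
  # addresses never lose access while the rules are being changed.
  printf 'ufw delete allow 443/tcp\n'
}

https_allowlist_rules 203.0.113.10
```

Run each printed command with sudo once the list looks right. Leave port 80 open: with 443 restricted, Caddy should still be able to obtain and renew its certificate through the ACME HTTP challenge, which arrives on port 80 from addresses you cannot predict.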
7. Backups
OLake keeps its state in two places: the persistence directory you set in section 3, and the metadata Postgres database inside the stack. Back up both.
The persistence directory is a straightforward file copy:
sudo tar czf /var/backups/olake-data-$(date +%F).tar.gz -C /var/lib/olake .
Dump the internal Postgres from within its container. Identify the service name from docker compose ps and substitute it for postgresql below. The output goes through sudo tee because /var/backups is writable only by root, so a plain shell redirect from your user would fail:
docker compose exec -T postgresql pg_dumpall -U postgres \
| gzip | sudo tee /var/backups/olake-meta-$(date +%F).sql.gz > /dev/null
Schedule both with cron or a systemd timer and push the archives off the VPS to RamNode object storage or another remote target. Your replicated data itself lives in the destination (S3, Iceberg, or local Parquet) and should be backed up according to that system's own practices. If you write Parquet locally on the VPS, include that output directory in your backup plan as well.
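The two backup commands above can be wrapped in one script with simple retention. This is a sketch under a few assumptions: it runs as root from cron, the compose project lives in /home/deploy/olake, the Postgres service and user are named postgresql and postgres as in the command above, and archives older than 14 days are disposable. The function names and the 14-day window are illustrative choices.

```shell
#!/bin/sh
COMPOSE_DIR=/home/deploy/olake   # assumption: where docker-compose.yml lives
DATA_DIR=/var/lib/olake          # persistence path set in section 3
DEST=/var/backups
KEEP_DAYS=14                     # assumption: retention window

# Delete OLake archives in directory $1 last modified more than $2 days ago.
prune_backups() {
  find "$1" -maxdepth 1 -type f -name 'olake-*.gz' -mtime +"$2" -delete
}

# Archive the persistence directory, dump the metadata database, then prune.
# Note: the dump pipeline reports gzip's exit status, not pg_dumpall's.
run_backup() {
  stamp=$(date +%F)
  tar czf "$DEST/olake-data-$stamp.tar.gz" -C "$DATA_DIR" . || return 1
  ( cd "$COMPOSE_DIR" &&
    docker compose exec -T postgresql pg_dumpall -U postgres ) \
    | gzip > "$DEST/olake-meta-$stamp.sql.gz" || return 1
  prune_backups "$DEST" "$KEEP_DAYS"
}

# Only act when called as "olake-backup.sh run", so the file can be sourced.
if [ "${1:-}" = "run" ]; then
  run_backup
fi
```

Saved as /usr/local/bin/olake-backup.sh, an entry such as 0 3 * * * root /usr/local/bin/olake-backup.sh run in /etc/cron.d/olake-backup would run it nightly. Copying the archives off the VPS is left out here and still needs adding.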
8. Monitoring and alerting
The Temporal UI is the authoritative view of job health: it shows running, completed, and failed workflows and lets you inspect why a sync failed. The temporal-worker logs also surface the periodic log-cleaner activity, which is worth watching to confirm old job logs are being pruned.
For automated alerting, watch container health and failed Temporal workflows. A simple approach is a cron job that runs docker compose ps and flags any unhealthy service, plus a query against the Temporal API for failed workflow counts.
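The container half of that check can be sketched as a small filter. It assumes a Docker Compose release recent enough to accept a Go template in docker compose ps --format and to expose the Service, State, Health, and ExitCode fields; verify that on your version before relying on it. unhealthy_services is a hypothetical helper name.

```shell
# Read "service|state|health|exitcode" lines on stdin and print the services
# that look broken: anything unhealthy, or anything not running that did not
# exit cleanly (one-shot jobs such as signup-init exit with code 0).
unhealthy_services() {
  awk -F'|' '
    $3 == "unhealthy"           { print $1; next }
    $2 == "running"             { next }
    $2 == "exited" && $4 == "0" { next }
                                { print $1 }'
}
```

From ~/olake, docker compose ps -a --format '{{.Service}}|{{.State}}|{{.Health}}|{{.ExitCode}}' | unhealthy_services prints nothing when all is well. The -a flag matters, since stopped containers are otherwise left out of the listing.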
On alert delivery, keep RamNode's mail restrictions in mind. RamNode blocks or throttles direct outbound SMTP on port 25 by default, so any alerting that relies on a local mailer or a raw port-25 connection will fail silently. Route notifications through a transactional email API over HTTPS, a chat webhook (Slack, Discord, or similar), or an authenticated relay on port 587 rather than expecting the VPS to deliver mail directly.
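A webhook notification needs nothing more than curl. The sketch below assumes a Slack-style incoming webhook that accepts a JSON body with a text field (Discord expects content instead) and reads the URL from OLAKE_WEBHOOK_URL, an environment variable name invented for this example.

```shell
# Build a {"text": ...} JSON body. Newlines become spaces; backslashes and
# double quotes are escaped. Other control characters are not handled.
webhook_payload() {
  msg=$(printf '%s' "$1" | tr '\n' ' ' | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"text":"%s"}' "$msg"
}

# POST the message over HTTPS; fails with a non-zero status on HTTP errors.
send_alert() {
  curl -fsS -m 10 -H 'Content-Type: application/json' \
       -d "$(webhook_payload "$1")" "$OLAKE_WEBHOOK_URL"
}
```

A cron job could then call send_alert "OLake: elasticsearch is unhealthy" whenever its check finds a problem. Because the request goes out over port 443, the port-25 restriction does not apply.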
9. Upgrades
OLake changed its compose layout at the end of January 2026, moving to the docker-compose-v1.yml file used in this guide. If you are coming from an older docker-compose.yml, follow OLake's documented migration path rather than swapping files in place, since the persistence layout differs.
For routine upgrades, pull the latest images and recreate:
cd ~/olake
docker compose pull
docker compose up -d
Your data and configuration survive because they live in the persistence directory and named volumes, not in the containers. Connector images used by jobs are pulled on demand; pin connector versions to stable releases (for example v0.1.8) rather than latest if you need reproducible job behavior.
10. Troubleshooting
If the UI never comes up, the most common cause is Elasticsearch failing to start because vm.max_map_count is too low. Confirm the sysctl from section 1 took effect with sysctl vm.max_map_count.
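That check can be made explicit with a few lines of shell. This is only a sketch: map_count_ok is a hypothetical helper, and 262144 is the value set in section 1.

```shell
# Succeed only if the given value meets the 262144 minimum set in section 1.
map_count_ok() {
  [ "$1" -ge 262144 ]
}

# Read the live value; fall back to 0 if sysctl is unavailable.
current=$(sysctl -n vm.max_map_count 2>/dev/null || echo 0)
if map_count_ok "$current"; then
  echo "vm.max_map_count is $current: OK"
else
  echo "vm.max_map_count is $current: too low, re-run sudo sysctl --system"
fi
```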
If the stack starts but you cannot log in, the signup-init container may have run before you set custom credentials. Check its logs with docker compose logs signup-init, and if needed wipe and recreate with corrected defaults.
If a sync job fails immediately, open the Temporal UI and inspect the workflow. Source connection errors (wrong host, missing replication permissions, CDC not enabled on the source) are the usual culprits and show up clearly there.
If the box runs out of memory under concurrent jobs, reduce job concurrency or move to a larger RamNode plan. Elasticsearch and Temporal together set a meaningful baseline before any replication work begins.
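Before resizing, it is worth measuring how much headroom the box has with the stack idle. A minimal sketch, where mem_available_mib is a hypothetical helper that reads the kernel's MemAvailable figure:

```shell
# Print MemAvailable in MiB from a meminfo-format file (default /proc/meminfo).
mem_available_mib() {
  awk '$1 == "MemAvailable:" { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

printf 'available memory: %s MiB\n' "$(mem_available_mib)"
```

For a per-container breakdown, docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}' shows which services account for the baseline.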
