
    Deploy ESPHome Dashboard on a VPS

    Self-host the ESPHome dashboard on a RamNode VPS — Caddy TLS, fail2ban, OTA over Tailscale or WireGuard, and automated config and binary backups.

    ESPHome is a YAML-driven firmware framework for ESP8266, ESP32, and RP2040 microcontrollers. The ESPHome dashboard is the web UI that compiles firmware, flashes devices over the air, streams logs, and manages secrets across an entire fleet. Running the dashboard on a RamNode VPS gives you a stable, always-on build server reachable from anywhere, with none of the noise and uptime concerns of a Raspberry Pi under a desk.

    This guide deploys ESPHome in Docker behind a Caddy reverse proxy with TLS, locks the dashboard behind authenticated access, configures fail2ban against brute force, and sets up automated backups of YAML configs and compiled binaries. It also covers the network topology required so that devices in your home can receive OTA updates from a cloud-hosted dashboard.

    What This Solves

    A typical ESPHome workflow involves editing YAML, compiling firmware, and pushing the binary to a device. The compile step is the heavy part: it needs Python, PlatformIO, and a full ESP toolchain, which easily adds up to 4 GB of disk and a CPU-bound minute or two per build. Doing this on a small SBC works but is slow and disk-hungry. A VPS with a couple of vCPUs cuts build times significantly and stays available for late-night fixes.

    The catch is OTA updates. Devices at home need to receive the compiled binary, which means either the device pulls it from the VPS (this works if the device has outbound internet), or you tunnel the VPS into your LAN so the dashboard can push it. Both are covered below.

    Resource Requirements

    ESPHome's compile workload is the binding constraint:

    • CPU: 2 vCPU minimum, 4 recommended for parallel builds
    • RAM: 2 GB minimum, 4 GB if you maintain a large fleet or multi-target firmware
    • Disk: 20 GB SSD. Each platform target (esp32, esp32s3, esp8266) caches a separate toolchain in PlatformIO's cache directory, and these accumulate quickly.
    • OS: Ubuntu 24.04 LTS or Debian 12

    A RamNode plan with 4 GB RAM and 2-4 vCPU is the sweet spot. Smaller plans work but compiles will run noticeably slower.

    Prerequisites

    • A RamNode VPS with Ubuntu 24.04 installed
    • A domain or subdomain pointed at the VPS (for example, an A record for esphome.example.com)
    • SSH access as a non-root sudo user
    • Either devices with outbound internet access OR a mesh VPN (Tailscale, WireGuard, ZeroTier) connecting the VPS to your home network

    For OTA to work from the cloud dashboard, the device must be able to accept a connection from the dashboard during the OTA process. This is the part that catches people. If your devices are behind NAT (the usual case), the VPS has to sit on the same logical network as the devices. Tailscale is the simplest path.

    Initial Server Hardening

    shell
    sudo apt update && sudo apt upgrade -y
    sudo apt install -y ufw fail2ban unattended-upgrades curl gnupg ca-certificates
    sudo dpkg-reconfigure --priority=low unattended-upgrades

    Configure UFW:

    shell
    sudo ufw default deny incoming
    sudo ufw default allow outgoing
    sudo ufw allow 22/tcp comment 'SSH'
    sudo ufw allow 80/tcp comment 'HTTP for ACME'
    sudo ufw allow 443/tcp comment 'HTTPS'
    sudo ufw enable

    The ESPHome dashboard listens on 6052 internally and is never exposed directly to the public internet. All access goes through Caddy on 443.
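
    To confirm that claim after enabling UFW, the rule list can be checked for the dashboard port. The following is a minimal sketch rather than an audit tool: port_allowed is a made-up helper name, and it assumes the column layout that ufw status prints by default.

```shell
#!/bin/sh
# port_allowed PORT: read `ufw status` output on stdin and return 0 if any
# ALLOW rule names PORT (for example "6052" or "6052/tcp").
# Hypothetical helper; assumes the default `ufw status` column layout.
port_allowed() {
    awk -v port="$1" '
        $1 ~ ("^" port "(/|$)") && /ALLOW/ { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

    Used as sudo ufw status | port_allowed 6052 && echo "6052 is exposed", it should stay silent with the rules above, while the same check for 443 should match.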

    Install Docker

    shell
    sudo install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    sudo chmod a+r /etc/apt/keyrings/docker.gpg
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    sudo apt update
    sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
    sudo usermod -aG docker $USER
    newgrp docker

    Install Tailscale

    Skip this section if your devices will pull firmware from the VPS over outbound internet on a port you control. Most home deployments push from the dashboard instead and need a tunnel.

    shell
    curl -fsSL https://tailscale.com/install.sh | sh
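    # --accept-routes lets this Linux node use subnet routes advertised elsewhere on the tailnet
    sudo tailscale up --ssh --accept-routes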

    Note the VPS Tailscale IP (something like 100.x.x.x). Install Tailscale on the same network as your ESPHome devices, either on a router that supports it (OPNsense, GL.iNet, OpenWrt with the package) or as a subnet router on a small machine. On a Linux subnet router, enable IP forwarding first, then advertise your home LAN subnet:

    shell
    echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-tailscale.conf && sudo sysctl -p /etc/sysctl.d/99-tailscale.conf
    sudo tailscale up --advertise-routes=192.168.1.0/24

    Approve the route in the Tailscale admin console, then verify from the VPS:

    shell
    ping 192.168.1.10

    You should reach your ESP devices by their LAN IP from the VPS.

    Directory Layout

    shell
    sudo mkdir -p /opt/esphome/{config,build}
    sudo chown -R $USER:$USER /opt/esphome

    The config directory holds your YAML files and is the most precious thing to back up. The build directory holds PlatformIO caches and compile artifacts and can be regenerated, although doing so adds minutes to the first compile after a wipe.

    Docker Compose Manifest

    Create /opt/esphome/docker-compose.yml:

    yaml
    services:
      esphome:
        image: ghcr.io/esphome/esphome:latest
        container_name: esphome
        restart: unless-stopped
        network_mode: host
        volumes:
          - ./config:/config
          - ./build:/config/.esphome
          - /etc/localtime:/etc/localtime:ro
        environment:
          - USERNAME=admin
          - PASSWORD=CHANGE_ME_TO_STRONG_PASSWORD
          - ESPHOME_DASHBOARD_USE_PING=true
        healthcheck:
          test: ["CMD", "curl", "-f", "http://127.0.0.1:6052"]
          interval: 30s
          timeout: 10s
          retries: 3

    Two important choices here:

    • network_mode: host is required for mDNS discovery, which is how ESPHome finds devices on the local network. mDNS multicast packets do not cross a Docker bridge, so devices behind one appear offline in the dashboard. Because UFW denies inbound traffic to port 6052, host networking does not expose the dashboard to the public internet.
    • ESPHOME_DASHBOARD_USE_PING=true switches from mDNS to ICMP for device status checks. This is essential when devices are reached over a VPN, since mDNS does not traverse Tailscale. If all your devices are on a routable subnet, this gives accurate up/down status without flapping.

    The USERNAME and PASSWORD environment variables enable the built-in authentication layer. We will also put Caddy basic auth in front, giving defense in depth.

    Bring it up:

    shell
    cd /opt/esphome
    docker compose up -d
    docker compose logs -f

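    Wait for Starting dashboard on port 6052 and confirm with curl -I http://127.0.0.1:6052. Any HTTP response shows that the dashboard is listening; because auth is enabled, expect a redirect to the login page or a 401 rather than a 200.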
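
    If the wait is scripted (for example in a provisioning run), the check can be wrapped in a small polling loop. This is a sketch under stated assumptions: wait_for_dashboard is a hypothetical helper, the 30 attempts and 2-second pause are arbitrary, and any HTTP status at all is treated as "the dashboard is up".

```shell
#!/bin/sh
# wait_for_dashboard [URL] [TRIES]: poll until the dashboard returns any HTTP
# response. Hypothetical helper; defaults match this guide's local port.
wait_for_dashboard() {
    url=${1:-http://127.0.0.1:6052}
    tries=${2:-30}
    i=1
    while [ "$i" -le "$tries" ]; do
        # -s: quiet, -o: discard the body, -w: print only the status code.
        # curl prints 000 when no HTTP response was received.
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || code=000
        if [ "$code" != "000" ]; then
            echo "dashboard answered with HTTP $code after $i attempt(s)"
            return 0
        fi
        if [ "$i" -lt "$tries" ]; then sleep 2; fi
        i=$((i + 1))
    done
    echo "no HTTP response from $url after $tries attempt(s)" >&2
    return 1
}
```

    Running wait_for_dashboard with no arguments polls the local port for about a minute before giving up.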

    Reverse Proxy with Caddy

    Install Caddy:

    shell
    sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https
    curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
    curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
    sudo apt update
    sudo apt install -y caddy

    Generate a basic auth hash:

    shell
    caddy hash-password

    Replace /etc/caddy/Caddyfile:

    caddyfile
    esphome.example.com {
        basic_auth {
            admin BCRYPT_HASH_HERE
        }
        reverse_proxy 127.0.0.1:6052 {
            transport http {
                keepalive 30s
            }
        }
        encode gzip
        log {
            output file /var/log/caddy/esphome.log
            format json
        }
    }

    No WebSocket-specific matcher is needed: Caddy's reverse_proxy passes Upgrade requests through on its own. This matters because the ESPHome dashboard streams live build output and device logs over WebSockets, and a proxy that drops or times out those connections leaves the log panel hanging on a blank screen.

    Reload:

    shell
    sudo systemctl reload caddy

    Verify TLS issuance:

    shell
    sudo journalctl -u caddy -n 50

    Look for a certificate obtained successfully line for esphome.example.com.
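
    For later edits to the Caddyfile, it can help to check the file before reloading so that a typo does not reach the running proxy. The sketch below assumes the Caddyfile path used in this guide; reload_caddy_checked is a hypothetical helper name. It uses caddy adapt, which parses the file without starting anything, so it catches syntax errors but not every runtime problem.

```shell
#!/bin/sh
# reload_caddy_checked [CADDYFILE]: parse the config first and reload only if
# parsing succeeds. Hypothetical helper; the path defaults to this guide's file.
reload_caddy_checked() {
    caddyfile=${1:-/etc/caddy/Caddyfile}
    # `caddy adapt` converts the Caddyfile to JSON and fails on syntax errors
    if caddy adapt --config "$caddyfile" --adapter caddyfile >/dev/null; then
        sudo systemctl reload caddy
    else
        echo "Caddyfile did not parse; running config left untouched" >&2
        return 1
    fi
}
```

    Calling reload_caddy_checked after each edit keeps the previous, working configuration in place whenever the new file fails to parse.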

    Hardening Authentication

    You now have two layers of password auth: Caddy basic auth and the ESPHome dashboard itself. This is intentional. The Caddy layer is a cheap filter that drops unauthenticated bots before they hit the Python application. The dashboard layer protects against any case where Caddy is bypassed (rare but possible if Tailscale exposes the dashboard to the tailnet).

    If you want a single sign-on experience, replace the Caddy basic auth with a forward-auth integration to Authentik or Authelia. The relevant Caddyfile block becomes:

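    caddyfile
    esphome.example.com {
        route {
            # The outpost's own endpoints (login start, callback) must reach Authentik directly
            reverse_proxy /outpost.goauthentik.io/* authentik:9000
            forward_auth authentik:9000 {
                uri /outpost.goauthentik.io/auth/caddy
                copy_headers X-Authentik-Username X-Authentik-Groups X-Authentik-Email X-Authentik-Name X-Authentik-Uid
            }
            reverse_proxy 127.0.0.1:6052
        }
    }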

    This requires a separate Authentik deployment, which is outside the scope of this guide.

    fail2ban for Caddy

    Create /etc/fail2ban/filter.d/caddy-auth.conf:

    ini
    [Definition]
    failregex = ^.*"remote_ip":"<HOST>".*"status":401.*$
    ignoreregex =

    Create /etc/fail2ban/jail.d/caddy.local:

    ini
    [caddy-auth]
    enabled = true
    port = http,https
    filter = caddy-auth
    logpath = /var/log/caddy/esphome.log
    maxretry = 8
    findtime = 600
    bantime = 86400

    Restart fail2ban:

    shell
    sudo systemctl restart fail2ban
    sudo fail2ban-client status caddy-auth
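
    Before relying on the jail, it is worth confirming that the log really contains the 401 lines the filter expects. fail2ban ships fail2ban-regex for testing a filter file against a log; as a lighter sanity check, the sketch below tallies 401 responses per client IP with standard text tools. It assumes Caddy's JSON access-log fields "remote_ip" and "status" as used in the failregex above; count_401_by_ip is a made-up helper name.

```shell
#!/bin/sh
# count_401_by_ip: read Caddy JSON access-log lines on stdin and print
# "<count> <ip>" for every client that received a 401, busiest first.
# Hypothetical helper; relies on the "remote_ip" and "status" JSON fields.
count_401_by_ip() {
    grep -E '"status":401([,}]|$)' \
        | sed -n 's/.*"remote_ip":"\([^"]*\)".*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{print $1, $2}'
}
```

    For example, sudo cat /var/log/caddy/esphome.log | count_401_by_ip lists the noisiest addresses, which should line up with what fail2ban-client status caddy-auth reports as banned.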

    First Login and Initial Configuration

    Open https://esphome.example.com. Caddy basic auth will prompt, then the ESPHome dashboard's own login screen will appear. Sign in with the username and password set in the Compose env.

    Create a secrets.yaml in the dashboard's secrets editor before defining any devices:

    yaml
    wifi_ssid: "your-iot-ssid"
    wifi_password: "your-iot-password"
    ota_password: "use-openssl-rand-hex-16"
    api_encryption_key: "32-byte-base64-key"

    Copy api_encryption_key from an existing device's Show API Key button, or generate a fresh 32-byte base64 key with:

    shell
    openssl rand -base64 32

    For the OTA password, use openssl rand -hex 16.
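
    Both random values can also be generated in one go and printed as a ready-to-paste skeleton. This is only a sketch: the helper names are invented for illustration, the Wi-Fi values are the placeholders from above, and /dev/urandom stands in for openssl so that nothing beyond coreutils is needed.

```shell
#!/bin/sh
# Helper names are illustrative, not part of ESPHome.

# gen_api_key: 32 random bytes, base64-encoded (44 characters)
gen_api_key() { head -c 32 /dev/urandom | base64; }

# gen_ota_password: 16 random bytes printed as 32 hex characters
gen_ota_password() { head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'; echo; }

# print_secrets: emit a secrets.yaml skeleton with fresh random values
print_secrets() {
    printf 'wifi_ssid: "%s"\n' "your-iot-ssid"
    printf 'wifi_password: "%s"\n' "your-iot-password"
    printf 'ota_password: "%s"\n' "$(gen_ota_password)"
    printf 'api_encryption_key: "%s"\n' "$(gen_api_key)"
}
```

    Run print_secrets once, paste the output into the secrets editor, and replace the Wi-Fi placeholders by hand. Running it again produces different keys, so keep the first output.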

    Device OTA Considerations

    When you click Install from the dashboard, ESPHome compiles the firmware on the VPS and then needs to push it to the device. There are three modes:

    1. OTA: Wirelessly: The dashboard initiates a TCP connection to the device on port 3232 (esp32) or 8266 (esp8266). The device must be reachable from the VPS at its local IP, which only works if you have Tailscale or a similar VPN bridging the networks.

    2. Manual download: The dashboard hands you a .bin file. You upload it to the device using the web flasher or esptool.py from any machine on the LAN. This works regardless of network topology but breaks the unattended workflow.

    3. OTA via HTTP request from device: ESPHome devices can be configured with http_request and a periodic check against a known URL. This pull model works through NAT but requires custom YAML and is rarely worth the complexity.

    For most deployments, mode 1 with Tailscale is the right answer. Confirm that devices' static IPs are in your config and that use_address matches the Tailscale-reachable address:

    yaml
    wifi:
      ssid: !secret wifi_ssid
      password: !secret wifi_password
      manual_ip:
        static_ip: 192.168.1.50
        gateway: 192.168.1.1
        subnet: 255.255.255.0
      use_address: 192.168.1.50

    Without use_address, the dashboard tries to resolve the device by hostname over mDNS, which won't work across Tailscale.
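
    Before the first wireless install, a quick TCP probe from the VPS shows whether the OTA port is reachable at all. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the 3-second timeout is arbitrary, and it relies on bash's /dev/tcp pseudo-device, so bash must be installed. A successful connection only proves that something is listening, not that the OTA password is right.

```shell
#!/bin/sh
# ota_port_open HOST PORT: return 0 if a TCP connection opens within 3 seconds.
# Hypothetical helper; uses bash's /dev/tcp, so it shells out to bash.
ota_port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# check_devices HOST:PORT...: report each target, return 1 if any is unreachable
check_devices() {
    rc=0
    for target in "$@"; do
        if ota_port_open "${target%:*}" "${target##*:}"; then
            echo "reachable: $target"
        else
            echo "UNREACHABLE: $target"
            rc=1
        fi
    done
    return "$rc"
}
```

    With the example device above, check_devices 192.168.1.50:3232 probes the ESP32 OTA port; use 8266 for an ESP8266.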

    Secrets and Config Backups

    The config directory is the source of truth. Lose it and you lose every device YAML, every secret, and every customization. Back it up nightly.

    Create /usr/local/sbin/esphome-backup.sh:

    shell
    #!/bin/bash
    set -euo pipefail
    umask 077  # archives contain secrets.yaml, so keep them readable by root only
    BACKUP_DIR="/var/backups/esphome"
    TS=$(date +%Y%m%d_%H%M%S)
    mkdir -p "$BACKUP_DIR"
    # Exclude the build cache - it is large and regenerable
    tar --exclude='.esphome' --exclude='.pioenvs' --exclude='.piolibdeps' \
        -czf "$BACKUP_DIR/esphome-config-$TS.tar.gz" \
        -C /opt/esphome config
    find "$BACKUP_DIR" -name 'esphome-config-*.tar.gz' -mtime +30 -delete

    Make executable and schedule:

    shell
    sudo chmod +x /usr/local/sbin/esphome-backup.sh
    echo "0 3 * * * root /usr/local/sbin/esphome-backup.sh" | sudo tee /etc/cron.d/esphome-backup

    Ship these to off-site object storage with rclone or restic. The configs are small (KB range) and a year's worth fits in any free tier.
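
    A backup that has never been opened is only a hope, so it is worth checking the newest archive now and then. The following sketch assumes the archive naming used by the script above; latest_backup and verify_backup are hypothetical helper names, and "contains at least one YAML file" is a deliberately crude definition of a good backup.

```shell
#!/bin/sh
# latest_backup DIR: print the newest esphome-config archive in DIR
latest_backup() {
    ls -1t "$1"/esphome-config-*.tar.gz 2>/dev/null | head -n 1
}

# verify_backup ARCHIVE: return 0 if the archive can be listed and
# contains at least one .yaml or .yml file
verify_backup() {
    archive=$1
    if ! listing=$(tar -tzf "$archive" 2>/dev/null); then
        echo "unreadable: $archive" >&2
        return 1
    fi
    if printf '%s\n' "$listing" | grep -Eq '\.ya?ml$'; then
        echo "ok: $archive"
    else
        echo "no YAML files in $archive" >&2
        return 1
    fi
}
```

    Run as root, verify_backup "$(latest_backup /var/backups/esphome)" gives a quick yes or no. A full restore test into a scratch directory is still the stronger check.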

    Compile Cache Management

    The PlatformIO build cache lives at /opt/esphome/build and grows monotonically. Periodically clean stale environments to free disk:

    shell
    docker exec esphome rm -rf /config/.esphome/build

    After this, the next compile for each device is a full rebuild and typically takes 60-120 seconds longer. For active fleets, run this cleanup quarterly or when disk usage crosses 70 percent.
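
    The 70 percent rule can be turned into a guard so that the cleanup only runs when it is needed. This is a minimal sketch, assuming the filesystem that holds /opt/esphome is the one that fills up; disk_used_pct and should_clean are made-up helper names.

```shell
#!/bin/sh
# disk_used_pct PATH: print the used percentage (an integer) of the
# filesystem that holds PATH
disk_used_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# should_clean PATH THRESHOLD: return 0 if usage is at or above THRESHOLD percent
should_clean() {
    [ "$(disk_used_pct "$1")" -ge "$2" ]
}
```

    A cron job or manual run could then use should_clean /opt/esphome 70 && docker exec esphome rm -rf /config/.esphome/build, leaving the cache alone while there is room.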

    Updates

    shell
    cd /opt/esphome
    docker compose pull
    docker compose up -d
    docker image prune -f

    ESPHome ships breaking changes between major versions occasionally. Before any update, check the latest changelog at esphome.io/changelog. Pin to a specific version tag in production if you maintain a large fleet:

    yaml
    image: ghcr.io/esphome/esphome:2025.5.0

    This lets you control the upgrade window rather than picking up changes on the next docker compose pull.

    Monitoring

    A few signals worth tracking:

    • Container health: The healthcheck in the Compose file marks the container unhealthy if the dashboard is unreachable. Docker only records that status; it does not alert anyone by itself. Hook it into your alerting (Uptime Kuma, Healthchecks.io), for example with a passive check that is pinged only while the container reports healthy.

    • Disk usage: /opt/esphome can blow up. Set a cron alert at 80 percent:

    shell
    echo "0 * * * * root df -P /opt/esphome | awk 'NR==2 {if (\$5+0 > 80) print \$0}' | tee /tmp/esphome-disk.warn" | sudo tee /etc/cron.d/esphome-disk

    • Build times: If a device that used to compile in 60 seconds now takes 5 minutes, the build cache may be corrupt. Clear it (see above) and rebuild.

    • Failed authentications: Watch fail2ban-client status caddy-auth for sustained ban activity. Scripted attempts are usually a sign that someone has discovered your subdomain. The dashboard URL is not a secret, but advertising it on public infrastructure inventory sites invites traffic.
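
    The container-health signal above can be wired to a passive check with a few lines of shell. Treat this as a sketch: the helper names are hypothetical, the ping URL is a placeholder for your own check, and the "/fail" suffix follows the Healthchecks.io convention for reporting a failure. Any status other than healthy, including starting, is reported as a failure here.

```shell
#!/bin/sh
# container_health NAME: print Docker's health status for NAME
# (healthy, unhealthy, starting), or "unknown" if it cannot be read.
container_health() {
    if status=$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null) \
        && [ -n "$status" ]; then
        echo "$status"
    else
        echo "unknown"
    fi
}

# ping_url_for STATUS BASE_URL: choose the URL to call for a passive check.
# BASE_URL is a placeholder for your own check's ping URL.
ping_url_for() {
    case "$1" in
        healthy) echo "$2" ;;
        *)       echo "$2/fail" ;;
    esac
}
```

    From cron, something like curl -fsS -m 10 "$(ping_url_for "$(container_health esphome)" "https://hc-ping.com/your-uuid")" reports success while the container is healthy and a failure otherwise; if the VPS itself goes down, the missing ping raises the alert.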

    Common Issues

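    • Dashboard reports devices as offline even though they are online: With ESPHOME_DASHBOARD_USE_PING=true, the VPS must be able to ping each device. Check that the home subnet route is approved and accepted on the VPS (tailscale up --accept-routes on Linux) and that neither UFW nor a firewall at home blocks ICMP to the device subnet. Also confirm that use_address is set in device configs when reaching them over Tailscale.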

    • OTA fails with Connection refused: The dashboard reached the device's IP but nothing was listening. Most often this is a stale IP in config after the device pulled a new DHCP lease. Set static IPs for devices managed remotely.

    • Compile fails with No space left on device: Disk is full, usually the build cache. Clean it as described above and consider upgrading the VPS storage tier.

    • WebSocket connection drops mid-build: Either a Caddy transport timeout is too low, or a proxy in front of Caddy (like Cloudflare) is closing idle connections. Cloudflare closes proxied connections that stay idle for roughly 100 seconds, which can break long compiles. Either move to a plan that lets you raise that timeout or bypass Cloudflare (DNS-only) for this subdomain.

    • network_mode: host is not an option (rootless docker): Use bridge mode with a published port and accept that mDNS device discovery won't work; rely on ESPHOME_DASHBOARD_USE_PING=true and static IPs.

    This dashboard is now production-ready. The next improvements to consider are git-backed config storage (mount the config directory from a git repo so every change is versioned and pushed off-server immediately) and a CI pipeline that validates YAML on commit before the dashboard sees it.