Netbird with Self-Hosted Control Plane & Authentik SSO
The most polished self-hosted mesh VPN in the series — Docker Compose, Authentik OIDC, posture checks, exit nodes, and a complete production hardening pass.
Prerequisites: Part 1; domain pointed at the VPS; Authentik available
Time: ~75 minutes
Outcome: SSO-driven mesh with policies, posture checks, and exit nodes
Architecture
Netbird has four runtime components:
- • Management server: REST API, holds network state in PostgreSQL or SQLite, owns the OIDC integration.
- • Signal server: coordinates peer-to-peer connection setup; brokers offers and answers.
- • Dashboard: React app served from the same domain, talks to the management API.
- • Relay (TURN): the fallback path when peer-to-peer fails; either coturn or the Netbird relay binary.
The agent (netbird daemon) runs on every peer. Management and signal are public-facing. Everything benefits from a TLS-terminating reverse proxy.
Sizing on RamNode
Deployment | Plan | Notes
Lab, < 10 peers | 2 GB | SQLite, single VPS
Small team, 10-50 peers | 4 GB | PostgreSQL, headroom for relay
50-200 peers | 4-8 GB | Network throughput on relay matters
200+ peers | 8 GB+ | Split relay onto its own instance

Prerequisites
- • Ubuntu 24.04 VPS with public IP
- • A domain (this guide uses netbird.example.com and auth.example.com)
- • Authentik already deployed (or Zitadel as an alternative)
- • Docker 27+ and Docker Compose v2
Authentik Configuration
Create an OAuth2/OIDC provider in Authentik with these settings:
- • Client type: Confidential
- • Redirect URIs: https://netbird.example.com/auth/callback, https://netbird.example.com/silent-auth
- • Scopes: openid profile email offline_access
- • Subject mode: Based on the User's hashed ID
- • Include claims in id_token: Yes
Create a second OAuth2 provider for service-to-service device flow with a separate client ID/secret. For group sync, add a property mapping that pushes user groups into the groups claim:
return [group.name for group in user.ak_groups.all()]

Bind the mapping to the OAuth2 provider with scope name groups.
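To see what the mapping expression yields, the sketch below evaluates the same list comprehension against stand-in objects. `FakeGroup`, `FakeGroupManager`, and `FakeUser` are hypothetical stubs that imitate only the `user.ak_groups.all()` shape used above; they are not Authentik classes, and the group names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGroup:
    """Hypothetical stand-in for an Authentik group (only `name` is modelled)."""
    name: str


@dataclass
class FakeGroupManager:
    """Hypothetical stand-in for the manager behind `user.ak_groups`."""
    groups: list = field(default_factory=list)

    def all(self):
        return list(self.groups)


@dataclass
class FakeUser:
    """Hypothetical stand-in for the `user` a property mapping receives."""
    ak_groups: FakeGroupManager


def groups_claim(user):
    # Same expression as the property mapping body above.
    return [group.name for group in user.ak_groups.all()]


user = FakeUser(FakeGroupManager([FakeGroup("admins"), FakeGroup("developers")]))
claims = {"groups": groups_claim(user)}
print(claims)
```

A token issued with the mapping bound would be expected to carry a `groups` list of this shape, which is what the group sync on the Netbird side reads.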
Deploy Netbird
Create /opt/netbird/docker-compose.yml:
services:
  dashboard:
    image: netbirdio/dashboard:latest
    restart: unless-stopped
    environment:
      - NETBIRD_MGMT_API_ENDPOINT=https://netbird.example.com
      - NETBIRD_MGMT_GRPC_API_ENDPOINT=https://netbird.example.com
      - AUTH_AUDIENCE=netbird-client-id
      - AUTH_CLIENT_ID=netbird-client-id
      - AUTH_CLIENT_SECRET=netbird-client-secret
      - AUTH_AUTHORITY=https://auth.example.com/application/o/netbird/
      - USE_AUTH0=false
      - AUTH_SUPPORTED_SCOPES=openid profile email offline_access groups
      - AUTH_REDIRECT_URI=/auth/callback
      - AUTH_SILENT_REDIRECT_URI=/silent-auth
      - NETBIRD_TOKEN_SOURCE=idToken
    labels:
      - traefik.enable=true
      - traefik.http.routers.dashboard.rule=Host(`netbird.example.com`)
      - traefik.http.routers.dashboard.entrypoints=websecure
      - traefik.http.routers.dashboard.tls.certresolver=letsencrypt
      - traefik.http.services.dashboard.loadbalancer.server.port=80
  signal:
    image: netbirdio/signal:latest
    restart: unless-stopped
    volumes: [signal-data:/var/lib/netbird]
    labels:
      - traefik.enable=true
      - traefik.http.routers.signal.rule=Host(`netbird.example.com`) && PathPrefix(`/signalexchange.SignalExchange/`)
      - traefik.http.services.signal.loadbalancer.server.port=10000
      - traefik.http.services.signal.loadbalancer.server.scheme=h2c
  management:
    image: netbirdio/management:latest
    restart: unless-stopped
    depends_on: [signal]
    volumes:
      - management-data:/var/lib/netbird
      - ./management.json:/etc/netbird/management.json
    command:
      - --port=33073
      - --log-file=console
      - --disable-anonymous-metrics=true
      - --single-account-mode-domain=netbird.example.com
      - --dns-domain=netbird.selfhosted
    labels:
      - traefik.enable=true
      - traefik.http.routers.management.rule=Host(`netbird.example.com`) && (PathPrefix(`/api`) || PathPrefix(`/management.ManagementService/`))
      - traefik.http.services.management.loadbalancer.server.port=33073
      - traefik.http.services.management.loadbalancer.server.scheme=h2c
  coturn:
    image: coturn/coturn:latest
    restart: unless-stopped
    network_mode: host
    volumes: [./turnserver.conf:/etc/turnserver.conf:ro]
  traefik:
    image: traefik:v3.1
    restart: unless-stopped
    ports: ["80:80", "443:443"]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./traefik.yml:/etc/traefik/traefik.yml:ro
      - traefik-acme:/acme
volumes:
  signal-data:
  management-data:
  traefik-acme:

management.json Walkthrough
management.json is the heart of the configuration. The Authentik-specific section:
{
  "HttpConfig": {
    "Address": "0.0.0.0:33073",
    "AuthIssuer": "https://auth.example.com/application/o/netbird/",
    "AuthAudience": "netbird-client-id",
    "AuthKeysLocation": "https://auth.example.com/application/o/netbird/jwks/",
    "AuthUserIDClaim": "sub",
    "IdpSignKeyRefreshEnabled": true,
    "OIDCConfigEndpoint": "https://auth.example.com/application/o/netbird/.well-known/openid-configuration"
  },
  "IdpManagerConfig": {
    "ManagerType": "authentik",
    "ClientConfig": {
      "Issuer": "https://auth.example.com/application/o/netbird/",
      "TokenEndpoint": "https://auth.example.com/application/o/token/",
      "ClientID": "netbird-mgmt-client-id",
      "ClientSecret": "netbird-mgmt-client-secret",
      "GrantType": "client_credentials"
    },
    "ExtraConfig": {
      "Username": "<authentik-admin-token-user>",
      "Password": "<authentik-admin-api-token>"
    }
  }
}

For coturn, restrict the relay to your overlay CIDR (an open TURN is a public proxy):
listening-port=3478
external-ip=<vps-public-ip>
realm=netbird.example.com
fingerprint
lt-cred-mech
user=netbird:<turn-password>
denied-peer-ip=10.0.0.0-10.255.255.255
denied-peer-ip=172.16.0.0-172.31.255.255
denied-peer-ip=192.168.0.0-192.168.255.255
allowed-peer-ip=<your-overlay-cidr>

Start the stack and follow the management log:

docker compose up -d
docker compose logs -f management

Watch for OIDC connection success on first start; auth misconfiguration shows up in this log.
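The peer-address rules in turnserver.conf can be reasoned about offline before any traffic hits the relay. The sketch below is a simplified model, not coturn's code: it assumes an explicit allow wins over a deny, and it assumes the overlay is 100.64.0.0/10. Addresses that match neither list are reported as "unmatched", and how coturn treats those should be confirmed against its documentation.

```python
import ipaddress

# Denied ranges copied from the turnserver.conf above.
DENIED_RANGES = [
    ("10.0.0.0", "10.255.255.255"),
    ("172.16.0.0", "172.31.255.255"),
    ("192.168.0.0", "192.168.255.255"),
]

# Assumption: the overlay CIDR placeholder is filled with 100.64.0.0/10.
ALLOWED_NETWORKS = [ipaddress.ip_network("100.64.0.0/10")]


def in_range(ip, start, end):
    """True if ip lies within the inclusive start-end range."""
    return ipaddress.ip_address(start) <= ip <= ipaddress.ip_address(end)


def relay_decision(peer_ip):
    """Return 'allowed', 'denied', or 'unmatched' for a relay peer address."""
    ip = ipaddress.ip_address(peer_ip)
    # Assumed precedence: an explicit allow overrides a deny.
    if any(ip in net for net in ALLOWED_NETWORKS):
        return "allowed"
    if any(in_range(ip, start, end) for start, end in DENIED_RANGES):
        return "denied"
    return "unmatched"


for candidate in ("100.64.12.7", "10.0.0.5", "192.168.1.20", "203.0.113.9"):
    print(candidate, relay_decision(candidate))
```

Running a handful of representative addresses through a model like this makes it easier to spot a range that was typed wrong before the relay goes live.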
First Login and Peer Enrollment
Visit https://netbird.example.com, log in via Authentik, land in the dashboard. Create a Setup Key (one-off, reusable, auto-grouped, time-bounded). Install the agent on a peer:
# Ubuntu/Debian
curl -fsSL https://pkgs.netbird.io/install.sh | sh
netbird up --management-url https://netbird.example.com --setup-key <key>

macOS, Windows, iOS, and Android clients are available from the official channels. The peer appears in the dashboard with an overlay IP from the default 100.64.0.0/10 range.
Building Policies
Netbird policies are deny-by-default once any policy exists. A policy has source group, destination group, ports/protocols, an enabled flag, and an optional posture-check binding. Three realistic policies:
- • Admins → production-servers — TCP/22
- • Developers → staging-databases — TCP/5432, 6379, 27017
- • All employees → wiki — TCP/80, 443
Group membership comes from setup keys at enrollment, manual UI assignment, or — the production answer — JWT groups claims from Authentik synced automatically.
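The policy model described above can be sketched as a small evaluator. This is an illustration of deny-by-default matching, not Netbird's implementation: the `Policy` fields mirror the attributes named in this section, the group names are taken from the three example policies, and the posture-check binding and the behaviour with no policies at all are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    source_group: str
    destination_group: str
    protocol: str
    ports: frozenset
    enabled: bool = True


POLICIES = [
    Policy("admins", "production-servers", "tcp", frozenset({22})),
    Policy("developers", "staging-databases", "tcp", frozenset({5432, 6379, 27017})),
    Policy("all-employees", "wiki", "tcp", frozenset({80, 443})),
]


def is_allowed(policies, src_groups, dst_groups, protocol, port):
    """Deny by default: allow only if some enabled policy matches."""
    for policy in policies:
        if not policy.enabled:
            continue
        if (policy.source_group in src_groups
                and policy.destination_group in dst_groups
                and policy.protocol == protocol
                and port in policy.ports):
            return True
    return False


# A developer reaching PostgreSQL on staging matches the second policy.
print(is_allowed(POLICIES, {"developers"}, {"staging-databases"}, "tcp", 5432))
# The same developer has no policy covering SSH to production.
print(is_allowed(POLICIES, {"developers"}, {"production-servers"}, "tcp", 22))
```

Because a peer can sit in several groups, the source and destination arguments are sets; any one matching pair is enough to allow the flow.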
Posture Checks
Posture checks add conditions beyond identity. Available checks: minimum Netbird agent version, minimum OS version, geo-location (requires MaxMind GeoLite2 mounted into management), peer network range CIDR, and process check (e.g., antivirus running). Bind a posture check to a policy in the editor. Peers that fail the check are denied silently, so watch the audit log when troubleshooting "but I have access" complaints.
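Two of the checks listed above, minimum agent version and peer network range, are easy to model. The sketch below is a hypothetical illustration of the comparison logic only; the `peer` dictionary keys and the function names are assumptions, and the network check is modelled as a simple allow list.

```python
import ipaddress


def parse_version(text):
    """Turn '0.28.4' or 'v0.28.4-rc1' into a comparable tuple like (0, 28, 4)."""
    core = text.strip().lstrip("v").split("-", 1)[0]
    parts = [int(part) for part in core.split(".")]
    parts += [0] * (3 - len(parts))  # pad '0.28' to (0, 28, 0)
    return tuple(parts)


def meets_minimum(agent_version, minimum_version):
    return parse_version(agent_version) >= parse_version(minimum_version)


def failed_checks(peer, min_agent_version, allowed_networks):
    """Return the names of the posture checks this peer fails."""
    failures = []
    if not meets_minimum(peer["agent_version"], min_agent_version):
        failures.append("agent_version")
    ip = ipaddress.ip_address(peer["network_ip"])
    if not any(ip in ipaddress.ip_network(net) for net in allowed_networks):
        failures.append("network_range")
    return failures


peer = {"agent_version": "0.9.0", "network_ip": "198.51.100.20"}
print(failed_checks(peer, "0.28.0", ["192.0.2.0/24"]))
```

Versions are compared as integer tuples rather than strings, since a plain string comparison would rank "0.9.0" above "0.28.0".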
Exit Nodes
Mark a peer as an exit node and route other peers' traffic through it for stable egress IPs, branch egress for compliance, or geo-bypass. In the dashboard, toggle "Use as exit node" on the peer, then create a route for 0.0.0.0/0 using that peer as the next hop and apply it to a peer group. Clients select the route with netbird routes select <route-id>.
Split-Horizon DNS
Netbird runs an internal DNS resolver on every peer at 100.81.0.1. Configure nameserver groups (per-domain upstream resolvers), custom zones (overlay-only A records), and search domains. The killer feature is split-horizon: send *.internal.example.com to your overlay DNS server and *.example.com to 1.1.1.1, transparently, on the same peer.
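The split-horizon behaviour amounts to a longest-suffix match from query name to upstream resolvers. The sketch below illustrates that selection rule under stated assumptions: the overlay DNS server address 100.64.0.53 and the fallback resolver 9.9.9.9 are hypothetical, and this is not how the Netbird resolver is implemented internally.

```python
# Hypothetical nameserver groups: match domain -> upstream resolvers.
NAMESERVER_GROUPS = {
    "internal.example.com": ["100.64.0.53"],  # assumed overlay DNS server
    "example.com": ["1.1.1.1"],
}
DEFAULT_UPSTREAMS = ["9.9.9.9"]  # assumed fallback for everything else


def pick_upstreams(qname):
    """Pick resolvers for a query name; the most specific suffix wins."""
    labels = qname.rstrip(".").lower().split(".")
    # Walk from the full name towards the TLD so longer suffixes match first.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in NAMESERVER_GROUPS:
            return NAMESERVER_GROUPS[suffix]
    return DEFAULT_UPSTREAMS


for name in ("db.internal.example.com", "www.example.com", "example.org"):
    print(name, pick_upstreams(name))
```

Checking the most specific suffix first is what lets db.internal.example.com go to the overlay server even though it also ends in example.com.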
Observability
The management server exposes Prometheus metrics on /metrics. Useful series:
- • netbird_management_peers_total
- • netbird_management_grpc_active_streams
- • netbird_management_login_total{status="success|failure"}
- • netbird_signal_active_connections
For peer debugging, netbird status shows connection state, route table, and connection mode (P2P vs relay) per peer.
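For a quick look at these series without a full Prometheus setup, the text exposition format is simple enough to parse by hand. The sketch below works on a hypothetical sample; the values are invented, and it assumes the login counter carries one `status` label per outcome. It skips comment lines and does not handle every corner of the format, such as braces inside label values.

```python
import re

# Hypothetical sample of what /metrics might return; values are invented.
SAMPLE = """\
# TYPE netbird_management_peers_total gauge
netbird_management_peers_total 42
netbird_management_grpc_active_streams 40
netbird_management_login_total{status="success"} 118
netbird_management_login_total{status="failure"} 7
"""

# metric name, optional {labels}, then the sample value
LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Map 'name{labels}' to a float for each sample line."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match:
            name, labels, value = match.groups()
            samples[name + (labels or "")] = float(value)
    return samples


def login_failure_ratio(samples):
    ok = samples.get('netbird_management_login_total{status="success"}', 0.0)
    bad = samples.get('netbird_management_login_total{status="failure"}', 0.0)
    total = ok + bad
    return bad / total if total else 0.0


samples = parse_metrics(SAMPLE)
print(samples["netbird_management_peers_total"], login_failure_ratio(samples))
```

A rising failure ratio is often the first visible sign of an OIDC misconfiguration or of someone hammering the login flow.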
Backup and Disaster Recovery
Back up: PostgreSQL (or SQLite at /var/lib/netbird/store.json), management.json, signal volume, and Authentik separately if colocated.
#!/bin/bash
# Netbird backup: stage state and config, encrypt, prune old archives.
set -euo pipefail
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
DEST="/backup/netbird/$TIMESTAMP"
mkdir -p "$DEST"
# Copy the management store out of the running container.
docker compose -f /opt/netbird/docker-compose.yml exec -T management \
  cat /var/lib/netbird/store.json > "$DEST/store.json"
cp /opt/netbird/management.json "$DEST/"
cp /opt/netbird/turnserver.conf "$DEST/"
# Archive the staged directory and encrypt it for the backup recipient.
tar czf - -C /backup/netbird "$TIMESTAMP" | \
  gpg --encrypt --recipient backups@example.com > "/backup/netbird-$TIMESTAMP.tar.gz.gpg"
# Delete encrypted archives older than 30 days.
find /backup -name 'netbird-*.tar.gz.gpg' -mtime +30 -delete

Restore is "stop containers, drop in backed-up files, start containers, verify peers reconnect." Test before you need it.
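One small step towards testing a backup is checking that the staged directory actually contains what the script meant to copy. A failed `docker compose exec` can leave an empty store.json behind, for example. The sketch below is a hypothetical checker; the file list mirrors the script above, and `backup_problems` is an illustrative name.

```python
import json
import tempfile
from pathlib import Path

# Files the backup script above stages into each timestamped directory.
EXPECTED_FILES = ("store.json", "management.json", "turnserver.conf")


def backup_problems(backup_dir):
    """Return human-readable problems found in a staged backup directory."""
    problems = []
    root = Path(backup_dir)
    for name in EXPECTED_FILES:
        path = root / name
        if not path.is_file():
            problems.append(f"{name}: missing")
        elif path.stat().st_size == 0:
            problems.append(f"{name}: empty")
        elif path.suffix == ".json":
            try:
                json.loads(path.read_text())
            except json.JSONDecodeError:
                problems.append(f"{name}: not valid JSON")
    return problems


# Demo against a throwaway directory with one good and one broken file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "store.json").write_text("{}")
    Path(tmp, "management.json").write_text("{not json")
    demo_problems = backup_problems(tmp)

print(demo_problems)
```

A check like this only proves the files are present and parse; it is not a substitute for a full restore drill on a spare VPS.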
Hardening Checklist
- Never expose /api without OIDC. Misconfigure the issuer and you have an open admin API.
- Strict rate limits at Traefik. Use the rate limit middleware; a flood of /api/users/me will pin management.
- Rotate the TURN password quarterly. Update both turnserver.conf and management.json together.
- Pin agent versions in production fleets. A misbehaving release can cascade.
- Audit log retention. Send to immutable storage — policy changes, enrollments, and login attempts.
- fail2ban on the dashboard host. Drop hammering 401 sources.
- TURN allowed-peer-ip restricts the relay. An open TURN is a public proxy.
- PostgreSQL once you cross 50 peers. SQLite hates concurrent writes during fleet upgrades.
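The rate-limit item above is easier to tune with a picture of what the two knobs do. Traefik's rate limit middleware is configured with an `average` rate and a `burst` size; the sketch below is a generic token bucket that illustrates those two parameters, not Traefik's implementation, and the numbers are arbitrary.

```python
class TokenBucket:
    """Generic token bucket: `average` tokens per second, up to `burst` stored."""

    def __init__(self, average, burst):
        self.rate = float(average)
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full
        self.last = 0.0

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(average=5, burst=10)
# 15 requests arriving at the same instant: only the burst gets through.
flood = [bucket.allow(now=0.0) for _ in range(15)]
print(flood.count(True), flood.count(False))
# One second later the bucket has refilled by `average` tokens.
print(bucket.allow(now=1.0))
```

A small burst with a modest average keeps the dashboard usable for humans while capping what a single misbehaving client can send to management.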
Troubleshooting
- • Peer enrolls but never connects. Usually signal connectivity: netbird status shows signal connected: false.
- • Only relay, never P2P. Both peers are behind symmetric NAT. The fix is to give one of them a public endpoint.
- • Policy not taking effect. Policies are deny-by-default once any policy exists. Add a temporary ICMP allow policy and ping to confirm.
- • "invalid audience" on login. Audience mismatch between dashboard, management, and Authentik; the value must match the client ID exactly in all three.
- • Groups not syncing. Decode a token in jwt.io. If groups is absent, the Authentik property mapping is not bound to the right scope.
