14 — Beszel System Metrics¶

Behind the Caddy reverse proxy (May 2026). The Beszel hub binds 127.0.0.1:8090 only and is reached at https://beszel.z2mini.gabrielgabrie.com (auto-renewing Let's Encrypt cert via Caddy; the hub's APP_URL was updated to match). On the box itself: http://127.0.0.1:8090. The old http://z2mini:8090 / http://100.67.235.68:8090 URLs no longer work. The agent still talks to the hub over the unix socket (its HUB_URL is now http://127.0.0.1:8090 — harmless either way in socket mode).

A lightweight system-metrics dashboard with continuous graphs of CPU, memory, disk, network, and per-container stats — the "what is the server actually doing right now" view that complements smartd's drive-failure email alerts and Homepage's at-a-glance launcher tiles.

Companion to 13-homepage.md. Homepage answers "where do I click?"; Beszel answers "is the server healthy?". The two are deliberately separate tools — both are lightweight, neither replaces the other, and email alerting is split: smartd handles drive health, Beszel handles everything else.

Overview¶

Two-container Docker stack, hub + agent on the same host:

Container	Image	Purpose
`beszel`	`henrygd/beszel:0.18.7`	Hub: web UI, SQLite-backed time-series DB, alert engine, SMTP sender
`beszel-agent`	`henrygd/beszel-agent:0.18.7`	Agent: collects CPU/RAM/disk/network/sensor/Docker-container metrics from the host

Hub reachable at https://beszel.z2mini.gabrielgabrie.com from any tailnet device. Not exposed to the public internet. Agent is not network-exposed — it communicates with the hub via a unix socket in a shared volume.

The two-process design future-proofs adding more machines (laptop, future VPS, off-site backup box at parents') without restructuring — extra agents can register against the same hub.

Design decisions¶

Decision	Reasoning
Beszel over Netdata / Glances / Grafana+Prometheus	Single Go binary per side, ~50 MB RAM total, modern UI, built-in alerting with email; Netdata is heavier and nags you toward its cloud; Grafana stack is 4+ containers and hours of YAML for one server
Hub + agent both in this compose, communicating via unix socket	Standard Beszel pattern for single-host. Avoids exposing the agent on any TCP port — it listens on a socket in a shared volume the hub bind-mounts. `network_mode: host` on the agent is still required for metric access (it reads `/proc` and the host network stack)
Hub bound to `127.0.0.1:8090`, fronted by Caddy	The hub web UI lives behind the single Caddy ingress at `https://beszel.z2mini.gabrielgabrie.com`; never reachable on the tailnet directly or publicly. (Was `100.67.235.68:8090` before the Caddy migration.) See 17-caddy.md.
Image pinning via `${BESZEL_VERSION}` in `.env`	Updates are deliberate (`docker compose pull` + restart), never automatic. Same posture as the other stacks.
`restart: unless-stopped` (not `always`)	Survives reboots, but `docker compose stop` actually stops
Email alerts directly to `smtp.gmail.com:587` from the hub (not piped through msmtp)	Beszel hub has built-in SMTP. Direct-to-Gmail with a separate app password keeps revocation independent — if Beszel's app password leaks, smartd's email channel via msmtp keeps working unaffected
Email alerts complement smartd, don't replace it	smartd handles drive-failure alerts (low-level SMART thresholds); Beszel handles host-level alerts (CPU pegged, RAM exhausted, disk filling, container down, agent unreachable). Different signal classes — keep both.
SQLite DB (no Postgres)	Beszel ships its own embedded DB; one less moving part. Same posture as Navidrome.
Agent `network_mode: host`	Required by Beszel — `/proc/net/dev` and similar are container-isolated under bridge networking, so agent must share the host network namespace to see real interface stats
Docker socket mounted read-only on agent	Lets the agent enumerate Docker containers and read per-container CPU/RAM/network. `:ro` is the safety property — the agent cannot start/stop containers via this mount.

Evaluated and not chosen: Netdata (~150 MB RAM, opinionated cloud upsell, more metrics than needed at this scale); Glances (a souped-up htop over HTTP, no time-series persistence — graphs are real-time only, no history); Grafana + Prometheus + node_exporter + cAdvisor (the right answer for a job, overkill for one homelab — 4 containers vs 2, hours of dashboards-as-YAML setup, real learning curve); Checkmk / Zabbix / Nagios (enterprise monitoring, wrong scale entirely).

Install¶

Directory layout¶

/data/docker/beszel/
├── docker-compose.yml     ← stack definition, version-pinned
├── .env                   ← version pin + agent KEY/TOKEN (mode 600 once filled)
├── data/                  ← hub SQLite DB + uploaded assets (PocketBase data)
├── socket/                ← unix socket shared between hub and agent
└── agent-data/            ← agent's local fingerprint / metrics buffer

/data/.beszel/             ← empty placeholder dir; bind-mounted into agent so
                              it can statvfs() /data without read access to its contents
/mnt/backup/.beszel/       ← same pattern for the USB backup drive

`docker-compose.yml`¶

Mirrors the upstream Beszel template verbatim except for these intentional customizations:

Version pinned via ${BESZEL_VERSION} from .env (not :latest)
Hub port published on 127.0.0.1 only ("127.0.0.1:8090:8090") — fronted by Caddy at https://beszel.z2mini.gabrielgabrie.com (see 17-caddy.md)
APP_URL set to https://beszel.z2mini.gabrielgabrie.com (the Caddy hostname) so email links and the web UI's generated URLs are correct
All bind-mount paths anchored under /data/docker/beszel/
Agent uses unix socket via shared socket/ volume (no TCP listener)
TZ set explicitly to America/Toronto
Agent has extra-filesystems mounts for /data and /mnt/backup (via .beszel/ placeholder subdirs — Beszel reads filesystem stats on the mount path without needing read access to actual contents). The container path carries a __<Label> suffix (/extra-filesystems/data__Data, /extra-filesystems/backup__Backup) so the hub UI shows friendly disk names — "Data" and "Backup" — instead of bare device names. See Disk monitoring below.
Agent HUB_URL set to http://127.0.0.1:8090 — the agent runs network_mode: host, so 127.0.0.1 is the host's loopback where the hub now listens. (In socket mode the hub↔agent link is the unix socket anyway, so HUB_URL is mostly a formality.)
Agent SENSORS: "" (empty) — disables temperature collection. The HP Z2 Mini exposes an hp-isa-0000 hwmon device that hard-blocks Beszel's sensor probe (no readable temps but the read syscall doesn't return), causing a temperature-collection-timeout warn every 60s. A whitelist (SENSORS: "coretemp_*,nvme_*,...") and a longer SENSORS_CONTEXT_TIMEOUT were both tried first; neither helped because the slow probe runs before the whitelist filter. CPU temps are still observable via sensors on-host if needed.

name: beszel

services:
  beszel:
    container_name: beszel
    image: henrygd/beszel:${BESZEL_VERSION}
    restart: unless-stopped
    environment:
      APP_URL: https://beszel.z2mini.gabrielgabrie.com
      TZ: America/Toronto
    ports:
      - "127.0.0.1:8090:8090"
    volumes:
      - "/data/docker/beszel/data:/beszel_data"
      - "/data/docker/beszel/socket:/beszel_socket"

  beszel-agent:
    container_name: beszel-agent
    image: henrygd/beszel-agent:${BESZEL_VERSION}
    restart: unless-stopped
    network_mode: host
    env_file: .env
    environment:
      LISTEN: /beszel_socket/beszel.sock
      NETWORK: unix
      HUB_URL: http://127.0.0.1:8090
      TZ: America/Toronto
      SENSORS: ""
    volumes:
      - "/data/docker/beszel/agent-data:/var/lib/beszel-agent"
      - "/data/docker/beszel/socket:/beszel_socket"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/data/.beszel:/extra-filesystems/data__Data:ro"
      - "/mnt/backup/.beszel:/extra-filesystems/backup__Backup:ro"

`.env`¶

# Version pin — same version for hub and agent. Verify current stable:
#   curl -s https://api.github.com/repos/henrygd/beszel/releases/latest | grep tag_name
BESZEL_VERSION=0.18.7

# Agent SSH key + websocket token — populated AFTER first hub boot.
# Both come from the hub web UI when you "Add System." Leave blank for the
# initial hub-only boot, then fill in and `docker compose up -d` to start the
# agent. Mode 600 once these are filled.
KEY=
TOKEN=

Phase 1 — bring up the hub alone¶

The agent can't start without KEY and TOKEN, which the hub UI generates when the first system is registered. The install therefore runs in three phases: hub first, then registration in the UI, then the agent.

mkdir -p /data/docker/beszel/{data,socket,agent-data}
mkdir -p /data/.beszel /mnt/backup/.beszel    # placeholder dirs for extra-filesystem stats
cd /data/docker/beszel
# Write docker-compose.yml + .env (KEY and TOKEN blank for now)
docker compose config --quiet      # validate YAML + variable resolution
docker compose pull                # ~50 MB total for both images
docker compose up -d beszel        # start ONLY the hub
docker compose ps                  # confirm "Up X seconds"

Open https://beszel.z2mini.gabrielgabrie.com in your browser. Set up the admin account (email + password — separate user database from any other service).

Phase 2 — register the system, get KEY + TOKEN¶

In the hub UI:

Top-right "+ Add System."
Name: z2mini. Host/IP: /beszel_socket/beszel.sock (the unix socket path inside the hub container — Beszel auto-detects this is a socket and switches to socket mode).
Click "Save." The UI shows two strings: a public key (ssh-ed25519 AAAA...) and a token (UUID-like).
Copy both into .env. Lock the file down first:

chmod 600 /data/docker/beszel/.env

Edit the file. Docker Compose's env_file parser supports values with spaces unquoted, so write the values as-is — no surrounding quotes:

KEY=ssh-ed25519 AAAA...
TOKEN=abcd1234-...-...

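(If you wrap the KEY in quotes, Compose may treat them as literal characters, and the agent then rejects the malformed key. Older Compose releases pass env_file values through unmodified, quotes included; the unquoted form works on every version.)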
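Before starting the agent, the .env conventions above can be checked mechanically. The following is a minimal sketch, not part of Beszel: `check_env_file` is a hypothetical helper name, and the checks (non-empty values, no surrounding quotes, `ssh-ed25519 ` prefix, mode 600) are simply the rules stated in this section.

```shell
# check_env_file FILE: sanity-check KEY/TOKEN before starting the agent.
# Hypothetical helper; it only encodes the conventions described above.
check_env_file() {
  local file="$1" key token mode
  # Take everything after the first '=' so the space inside KEY survives.
  key="$(sed -n 's/^KEY=//p' "$file")"
  token="$(sed -n 's/^TOKEN=//p' "$file")"
  mode="$(stat -c '%a' "$file")"   # GNU stat (Linux)

  [ -n "$key" ]   || { echo "KEY is empty"; return 1; }
  [ -n "$token" ] || { echo "TOKEN is empty"; return 1; }
  case "$key" in
    \"*|\'*)         echo "KEY is wrapped in quotes"; return 1 ;;
    "ssh-ed25519 "*) ;;   # expected shape: key type, space, base64 blob
    *)               echo "KEY does not start with 'ssh-ed25519 '"; return 1 ;;
  esac
  [ "$mode" = "600" ] || { echo "mode is $mode, expected 600"; return 1; }
  echo "ok"
}
```

Usage: `check_env_file /data/docker/beszel/.env` prints `ok` or the first problem it finds.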

Phase 3 — start the agent¶

cd /data/docker/beszel
docker compose up -d beszel-agent
docker compose ps
docker compose logs --tail 20 beszel-agent

Within ~30 seconds the system in the hub UI flips from "Pending" to "Up" and metrics start streaming in. Refresh the browser tab if the status doesn't change on its own.

Configure email alerts¶

Beszel uses PocketBase under the hood; SMTP is configured in the hub admin settings, not via env vars.

Get a separate Gmail app password¶

Why separate from msmtp: if a token is ever compromised, you want to revoke that one credential without disturbing the other email channel. smartd → msmtp must keep working independently of Beszel.

Open https://myaccount.google.com/apppasswords (signed in as gabrielgabrie99@gmail.com).
App name: beszel-z2mini. Generate. Copy the 16-character password.

Configure SMTP in the hub UI¶

In the Beszel hub UI:

Click your avatar (top-right) → "Settings."
Mail Settings tab → enable "Send mail with SMTP server."
Fill in:

Field	Value
SMTP server host	`smtp.gmail.com`
SMTP server port	`587`
Username	`gabrielgabrie99@gmail.com`
Password	(the Beszel-specific app password from above)
TLS	Off (use STARTTLS — the default port-587 + STARTTLS combo is what Gmail expects)

Click "Save."
Click "Send test email." Inbox should receive Test email from Beszel within ~10 seconds.

Configure alerts on the system¶

Per-system alerts are configured in the system detail view, not globally:

Click z2mini in the system list.
"Alerts" tab → enable the alerts you want. Sensible defaults for this server:

Alert	Threshold	Why
CPU	> 80% for 10 min	Catches runaway processes; normal Immich ingest can spike short-term
Memory	> 85% for 5 min	64 GB box, sustained 85% means something's wrong
Disk	`/data` > 90%	Same disk-fill class of incident as the iCloud-import migration
Disk	`/` > 80%	OS drive shouldn't fill — Docker images live there
Status	Down for > 2 min	Agent unreachable → host or container died
Container down	any container down for > 2 min	Catches Immich/Navidrome/Homepage crashes

"Save."

A test email isn't built into the alert config; verify the channel via "Send test email" in Mail Settings instead.
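To sanity-check the disk thresholds from the host side, a small `df` wrapper is enough. This is a rough sketch with hypothetical helper names (`disk_pct`, `over_threshold`); Beszel may compute its percentage slightly differently from `df`, so treat small differences as expected.

```shell
# disk_pct PATH: print the used-space percentage of the filesystem holding PATH.
disk_pct() {
  # -P forces one line per filesystem; column 5 is the capacity, e.g. "42%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold PATH LIMIT: succeed when usage is strictly above LIMIT percent.
over_threshold() {
  [ "$(disk_pct "$1")" -gt "$2" ]
}
```

Usage: `over_threshold /data 90 && echo "/data is above the alert threshold"`.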

Disk monitoring¶

The agent only reports disk stats for filesystems that are bind-mounted into its container. The OS root drive is auto-detected (Docker bind-mounts /etc/resolv.conf from the host's root fs, so Beszel can statvfs() it). For the other two filesystems — /data (the SK hynix internal NVMe) and /mnt/backup (the 990 PRO over USB) — the agent's compose bind-mounts an empty .beszel/ placeholder directory on each:

    volumes:
      # ...
      - "/data/.beszel:/extra-filesystems/data__Data:ro"
      - "/mnt/backup/.beszel:/extra-filesystems/backup__Backup:ro"

The placeholder dir gives Beszel statvfs() info on the filesystem without exposing its actual contents (the dir is empty and mounted :ro).
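The "empty and exists" property of the placeholder dirs is easy to verify before bringing the agent up. A minimal sketch, with `check_placeholder` as a hypothetical helper name:

```shell
# check_placeholder DIR: confirm DIR exists and is empty, so bind-mounting it
# exposes filesystem stats but no file contents.
check_placeholder() {
  local dir="$1"
  [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
  # ls -A lists dotfiles too (but not . and ..); any output means not empty.
  if [ -n "$(ls -A "$dir")" ]; then
    echo "not empty: $dir"
    return 1
  fi
  echo "ok: $dir"
}
```

Usage: `check_placeholder /data/.beszel && check_placeholder /mnt/backup/.beszel`. Running `df -P /data/.beszel` afterwards shows which filesystem the placeholder actually sits on.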

The __<Label> suffix on the container path is how you set the displayed name in the hub UI. /extra-filesystems/data__Data shows up as "Data", /extra-filesystems/backup__Backup shows up as "Backup" — instead of bare device names. The result in the Beszel hub:

Beszel UI name	Filesystem	Device
Data	`/data`	SK hynix internal NVMe
Backup	`/mnt/backup`	Samsung 990 PRO 1TB over USB (ASM2462 enclosure)
`nvme1n1p2`	`/` (OS root)	Samsung internal NVMe — shows as the device name; Beszel doesn't support renaming the root/main filesystem
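The naming convention in the table can be illustrated with plain parameter expansion. This sketch only mirrors the `<name>__<Label>` pattern as described here; it is not Beszel's own parser, and `fs_label` is a hypothetical name.

```shell
# fs_label CONTAINER_PATH: print the label encoded after the last "__" in the
# final path component; fail if the path carries no label suffix.
fs_label() {
  local name="${1##*/}"            # last path component, e.g. "data__Data"
  case "$name" in
    *__*) echo "${name##*__}" ;;   # text after the last "__", e.g. "Data"
    *)    return 1 ;;              # no suffix: nothing to display
  esac
}
```

Usage: `fs_label /extra-filesystems/backup__Backup` prints `Backup`.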

Before the relabel (and before the backup drive was swapped), the agent showed bare device names and was still pointing at the now-gone sda1 until it was restarted. The rollback for this change is /data/docker/beszel/docker-compose.yml.pre-disklabels on the server. After relabeling, the hub briefly shows stale "ghost" disk entries (nvme0n1p1, sdb1, sda1) until they age out of the hub's retention — harmless.

If the backup drive is ever swapped again, recreate the placeholder dir on the new drive (mkdir /mnt/backup/.beszel), then restart the agent (docker restart beszel-agent) — it re-detects the new device behind /extra-filesystems/backup__Backup. No compose change needed. (This is part of the backup-drive-swap checklist in 05-backups.md.)

Operations¶

Start / stop / pull updates¶

cd /data/docker/beszel
docker compose up -d           # start (or recreate after pull)
docker compose stop            # stop both containers
docker compose down            # stop and remove (data preserved)

# Updates: bump BESZEL_VERSION in .env, then:
docker compose pull
docker compose up -d

The hub and agent are pinned to the same version. Don't update them independently — the hub-agent wire protocol can change between minor versions, and a version skew breaks metric ingest until both are aligned.
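Because both images read the same `${BESZEL_VERSION}`, bumping the pin in one place keeps hub and agent aligned. A hedged sketch of that edit, with `set_beszel_version` as a hypothetical helper (it assumes GNU sed and a plain MAJOR.MINOR.PATCH tag):

```shell
# set_beszel_version ENV_FILE VERSION: rewrite the BESZEL_VERSION pin in place.
set_beszel_version() {
  local file="$1" version="$2"
  # Accept plain MAJOR.MINOR.PATCH only, so a typo cannot become an image tag.
  if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "not a MAJOR.MINOR.PATCH version: $version" >&2
    return 1
  fi
  if ! grep -q '^BESZEL_VERSION=' "$file"; then
    echo "no BESZEL_VERSION line in $file" >&2
    return 1
  fi
  # Only the pin line changes; KEY and TOKEN are left untouched.
  sed -i "s/^BESZEL_VERSION=.*/BESZEL_VERSION=$version/" "$file"
}
```

Usage: `set_beszel_version .env <new-version> && docker compose pull && docker compose up -d`, run from /data/docker/beszel.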

Logs¶

docker compose logs -f --tail 50 beszel              # hub
docker compose logs -f --tail 50 beszel-agent        # agent

Disk usage¶

du -sh /data/docker/beszel/data/ /data/docker/beszel/agent-data/

The hub SQLite DB grows at roughly 5-10 MB per system per month at the default 1-minute granularity — negligible at this scale. Beszel auto-prunes old data per the retention setting in the hub UI (Settings → Data Retention).
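For a quick feel of what that growth rate means over time, the arithmetic can be written out. This is only the 5-10 MB per system per month figure multiplied through; it ignores retention pruning, so it is an upper-bound style estimate. `db_growth_mb` is a hypothetical helper name.

```shell
# db_growth_mb SYSTEMS MONTHS: print "LOW HIGH" estimated hub DB growth in MB,
# using the 5-10 MB per system per month figure quoted above.
db_growth_mb() {
  local systems="$1" months="$2"
  echo "$(( systems * months * 5 )) $(( systems * months * 10 ))"
}
```

Usage: `db_growth_mb 1 12` prints `60 120`, i.e. roughly 60-120 MB for this single host over a year before pruning.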

Connecting from the server itself¶

The hub binds 127.0.0.1:8090, so scripts/tools on z2mini use http://127.0.0.1:8090 (or http://localhost:8090). https://beszel.z2mini.gabrielgabrie.com also works from the box (via Caddy). http://z2mini:8090 and http://100.67.235.68:8090 no longer work — those bindings were removed when the hub moved behind Caddy. From other tailnet devices, the only way in is https://beszel.z2mini.gabrielgabrie.com.

The agent (network_mode: host) reaches the hub at http://127.0.0.1:8090 — same loopback as the host. In practice they communicate over the unix socket, so this is mostly a formality.
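The "loopback only" claim can be checked against the host's listening sockets. The sketch below parses `ss -ltnH` output from stdin so it can be tested without a running hub; `loopback_only` is a hypothetical helper name, and it assumes the local address is the fourth column, as in `ss`'s default layout.

```shell
# loopback_only PORT: read `ss -ltnH` output on stdin; succeed only if at least
# one listener uses PORT and every such listener is bound to 127.0.0.1.
loopback_only() {
  awk -v port="$1" '
    {
      addr = $4                      # local address:port column
      n = split(addr, parts, ":")    # port is always the last ":" field
      if (parts[n] != port) next     # some other port, ignore
      seen = 1
      if (addr != "127.0.0.1:" port) bad = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}
```

Usage: `ss -ltnH | loopback_only 8090 && echo "hub is loopback-only"`. A listener on `0.0.0.0:8090`, `[::]:8090`, or the Tailscale IP makes the check fail.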

Backup considerations¶

Beszel is in the nightly backup (since May 2026 — see 05-backups.md):

data/data.db AND data/auxiliary.db — the hub's SQLite databases (the time-series metrics history is the main thing of operational value: losing it means losing "what did the server look like 3 weeks ago"). Both are captured via a host sqlite3 "<live db>" ".backup '<dest>'" into /mnt/backup/current/db-dumps/beszel-data.db and beszel-auxiliary.db. Never raw-rsync an open SQLite DB — same corruption risk as Navidrome's navidrome.db and Vaultwarden's db.sqlite3. Both also go to the off-site T5 (they're in db-dumps/).
.env (contains the agent KEY and TOKEN, mode 600) + docker-compose.yml — rsync'd into /mnt/backup/current/service-config/beszel/.
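The dump step described above can be sketched as a small function. This is an illustration under stated assumptions, not the actual nightly backup script: `backup_sqlite` is a hypothetical name, the `sqlite3` CLI must be installed on the host, and the destination path must not contain a single quote.

```shell
# backup_sqlite SRC DEST: snapshot a live SQLite DB via the CLI's .backup
# command, then verify the copy before trusting it.
backup_sqlite() {
  local src="$1" dest="$2" result
  sqlite3 "$src" ".backup '$dest'" || return 1
  # integrity_check prints the single word "ok" when the copy is sound.
  result="$(sqlite3 "$dest" 'PRAGMA integrity_check;')"
  if [ "$result" != "ok" ]; then
    echo "integrity check failed for $dest: $result" >&2
    return 1
  fi
}
```

Usage: `backup_sqlite /data/docker/beszel/data/data.db /mnt/backup/current/db-dumps/beszel-data.db`, and the same for auxiliary.db.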

What's not backed up:

data/id_ed25519 — the hub's SSH key. It's root-owned, so the backup user (gabriel) can't read it — it's silently excluded. It's regenerable, but regenerating it means re-registering the one agent (the agent verifies the hub's key). On a restore, Beszel just generates a fresh one on first boot and you re-pair the agent. (This is why the service-config/beszel/ rsync excludes the whole data/ dir — the DBs are captured as dumps and the rest is either unreadable or regenerable.)
agent-data/ — agent fingerprint + buffer. Rebuildable by re-registering the system.
socket/ — the unix socket; transient, never back up.

Restore: drop db-dumps/beszel-data.db → data/data.db and db-dumps/beszel-auxiliary.db → data/auxiliary.db, restore .env + compose from service-config/beszel/, docker compose up -d (hub regenerates id_ed25519), then re-register the agent in the hub UI (new KEY/TOKEN into .env, docker compose up -d beszel-agent). See 08-recovery.md → Step 6b.

Troubleshooting¶

Hub UI loads but says "No systems":

Phase 2 wasn't run yet. Click "+ Add System" and use /beszel_socket/beszel.sock as the host.

System added but shows "Pending" forever:

Agent isn't running. docker compose ps — beszel-agent should be Up.
Agent is running but KEY or TOKEN is empty/wrong. docker compose logs beszel-agent | grep -iE 'key|token|auth'.
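Common pitfall: a KEY that doesn't match the format from Phase 2. The value contains a space but belongs in .env unquoted (KEY=ssh-ed25519 ..., not KEY="ssh-ed25519 ..."); surrounding quotes can end up as part of the key, and the agent then rejects it.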
Restart the agent after fixing: docker compose up -d beszel-agent (re-reads .env).

Email test from Mail Settings fails with "STARTTLS error":

Wrong port or TLS toggle. Gmail expects port 587 with TLS toggle OFF (Beszel uses STARTTLS automatically on port 587). Port 465 with TLS toggle ON is the alternate; don't mix.
App password is a 16-char string with NO SPACES. Google displays it with spaces for readability — strip them when pasting.
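Stripping the display spaces is simple to get wrong by hand, so here is a small sketch of the cleanup. `normalize_app_password` is a hypothetical helper; note that passing a real secret as a command-line argument leaves it in shell history, so this is for illustration rather than routine use.

```shell
# normalize_app_password RAW: strip all whitespace and require exactly
# 16 characters, the length stated above for Google app passwords.
normalize_app_password() {
  local pw
  pw="$(printf '%s' "$1" | tr -d '[:space:]')"
  if [ "${#pw}" -ne 16 ]; then
    echo "expected 16 characters after stripping spaces, got ${#pw}" >&2
    return 1
  fi
  printf '%s\n' "$pw"
}
```

Usage: `normalize_app_password 'abcd efgh ijkl mnop'` prints `abcdefghijklmnop`.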

No alerts firing despite metrics looking high:

Alerts are per-system, configured under the system detail view, not globally. Check you're on the z2mini page, not the dashboard.
Alert thresholds must be sustained for the duration window. A 30-second CPU spike won't trigger a "10-minute over 80%" alert by design.
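Why a short spike stays silent is easiest to see with a toy model. The sketch below assumes the rule is evaluated on the mean of the last N one-minute samples; that is a simplification chosen for illustration, and Beszel's actual evaluation may differ. `window_alert` is a hypothetical name.

```shell
# window_alert THRESHOLD WINDOW: read one sample per line on stdin (oldest
# first) and succeed if the mean of the last WINDOW samples exceeds THRESHOLD.
window_alert() {
  awk -v limit="$1" -v window="$2" '
    { sample[NR] = $1 }
    END {
      if (NR < window) exit 1              # not enough history yet
      for (i = NR - window + 1; i <= NR; i++) sum += sample[i]
      exit (sum / window > limit) ? 0 : 1
    }
  '
}
```

Usage: nine samples at 20% and one at 100% average 28%, so `window_alert 80 10` does not fire; ten samples at 90% do.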

Agent shows "0%" CPU and disk for /:

Agent isn't running in network_mode: host. Check docker compose config | grep network_mode. Without host mode, agent reads container-internal /proc, which is meaningless.

Agent only reports the OS drive — no /data or /mnt/backup panel:

The agent only sees filesystems that are bind-mounted into its container. The OS drive is auto-detected via /etc/resolv.conf (Docker bind-mounts that from the host's root fs). For other filesystems, you must explicitly mount them under /extra-filesystems/<name>__<Label>/.
Pattern: create an empty .beszel/ placeholder directory on the target filesystem (e.g., mkdir /data/.beszel), then add - /data/.beszel:/extra-filesystems/data__Data:ro to the agent's volumes: and docker compose up -d beszel-agent. The placeholder approach gives Beszel statvfs() info on the filesystem without exposing actual file contents; the __Data suffix sets the displayed name. See Disk monitoring above.

Beszel still shows sda1 (or another stale device) instead of "Backup":

The agent caches the device behind each /extra-filesystems/<x> mount until it's restarted. After the backup drive was swapped (T5 → 990 PRO), the agent kept reporting the old sda1 until docker restart beszel-agent. Restart it. (The hub also keeps stale "ghost" entries — nvme0n1p1, sdb1, sda1 — for a while after a relabel; those age out of the hub's retention on their own, harmless.)

Backup drive shows up but the OS root is named nvme1n1p2, not something friendlier:

Expected. Beszel doesn't support renaming the root/main filesystem — it always shows the device name there. Only /extra-filesystems/<x>__<Label> mounts get a custom name. Not a bug.

Agent log shows WebSocket connection failed err="dial tcp 127.0.0.1:8090: connect: connection refused":

HUB_URL doesn't match where the hub is actually listening. Since the hub moved behind Caddy it binds 127.0.0.1:8090 (loopback), and the agent runs network_mode: host, so it shares the host's loopback. Set HUB_URL: http://127.0.0.1:8090 in the agent's environment and recreate. If HUB_URL already points there and the dial is still refused, the hub itself isn't listening on loopback: check docker compose ps and the hub's ports: entry. (The hub↔agent link is the unix socket regardless; this just keeps the WebSocket path clean instead of falling back to SSH with a noisy warn.)

Agent log shows WARN Error updating temperatures err="temperature collection timed out" every 60s:

The HP Z2 Mini exposes an hp-isa-0000 hwmon device (visible in sensors output but with no readable temp values) that hard-blocks Beszel's per-sensor probe. The blocking happens before any SENSORS whitelist filter, so whitelisting didn't help in testing; raising SENSORS_CONTEXT_TIMEOUT to 5s also didn't help — the underlying read just doesn't return.
Pragmatic fix: set SENSORS: "" (empty string) to disable temperature collection entirely. CPU temps remain observable on-host via sensors, and the GPU has its own monitoring path via nvidia-smi.
Revisit only if hot-running becomes an operational concern; at that point investigate which specific hwmon node is blocking and either unbind it from the kernel or use SENSORS to whitelist working ones (and discover whether the block is on probe-list or per-device read).
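If that investigation ever becomes necessary, a time-limited read of each hwmon temperature file is one way to find the offending node. This is a sketch under assumptions: `probe_hwmon` is a hypothetical helper, it relies on coreutils `timeout`, and a read stuck in uninterruptible kernel sleep may not respond to the signal at all, in which case the probe itself hangs on that node (which still identifies it).

```shell
# probe_hwmon [ROOT] [SECONDS]: read every temp*_input under ROOT with a time
# limit and report per file whether the read finished, timed out, or errored.
probe_hwmon() {
  local root="${1:-/sys/class/hwmon}" secs="${2:-2}" f chip rc
  for f in "$root"/hwmon*/temp*_input; do
    [ -e "$f" ] || continue                       # glob matched nothing
    chip="$(cat "${f%/*}/name" 2>/dev/null || echo unknown)"
    rc=0
    timeout "$secs" cat "$f" >/dev/null 2>&1 || rc=$?
    case "$rc" in
      0)   echo "ok      $chip $f" ;;
      124) echo "timeout $chip $f" ;;             # timeout's exit status
      *)   echo "error   $chip $f" ;;             # read failed promptly
    esac
  done
}
```

Usage: run `probe_hwmon` on the host; any `timeout` line names the chip (from the node's `name` file) whose read does not return.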

Container shows "down" in Beszel but docker compose ps shows it up:

Docker socket isn't mounted on the agent. Check docker compose exec beszel-agent ls -la /var/run/docker.sock.
Container name in Beszel uses the actual Docker container name (e.g., immich_server, navidrome, homepage) — case-sensitive.

http://localhost:8090 doesn't work from the server (browser):

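This used to be the same gotcha as the other stacks (bound to the Tailscale interface IP only). Since the Caddy migration the hub publishes on 127.0.0.1:8090, so localhost normally works; if it still fails, localhost is probably resolving to ::1 (IPv6), where nothing is listening.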
Fix: use http://127.0.0.1:8090 or https://beszel.z2mini.gabrielgabrie.com.