05 — Backup System

Daily versioned backups of /data/files, the Immich photo library, the music library, every self-hosted service's database and config, and the system configuration — to a 1 TB NVMe SSD over USB. Plus a manually-rotated off-site drive kept at family's place.


Architecture

/data/files/                              ← Samba share, user files
/data/docker/immich/library/              ← Immich UPLOAD_LOCATION (photos, uploads, transcodes, thumbs, DB dumps)
/data/music/                              ← Navidrome music library
/data/docker/<svc>/                       ← each service's compose + .env + config (excludes bulk data + live DBs)
        ↓ (rsync mirrors / sqlite3 .backup / pg_dump, nightly 3 AM)
/mnt/backup/current/
    files/             ← rsync mirror of /data/files/
    immich/            ← rsync mirror of /data/docker/immich/library/  (incl. backups/ = Immich's own DB dumps)
    music/             ← rsync mirror of /data/music/
    db-dumps/          ← refreshed every run via `sqlite3 .backup`: vaultwarden-db.sqlite3,
                          beszel-data.db, beszel-auxiliary.db, navidrome.db
    service-config/    ← per-service: immich/ navidrome/ radicale/ vaultwarden/ beszel/ homepage/ caddy/
                          openproject/ (incl. backups/*.sql.gz from nightly pg_dump + assets/)
        ↓ (cp -al, hard-linked snapshot)
/mnt/backup/daily/2026-05-12/             ← today's snapshot (dated by the run's START time)
                  2026-05-11/             ← yesterday
                  ...                     ← (last 7 retained)
        ↓ (cp -al, Sundays only)
/mnt/backup/weekly/2026-05-10/            ← Sunday snapshots
                   ...                    ← (last 4 retained)

/mnt/backup/system-state/                 ← system configuration backup (unchanged in mechanism)
    installed-packages.txt                ← all packages (for reinstall)
    manual-packages.txt                   ← manually-installed subset (now includes sqlite3)
    gabriel-crontab.txt                   ← user crontab
    system-config-2026-05-12.tar.gz       ← /etc configs (last 7 retained)

/mnt/backup/backup.log                    ← every run; rotated weekly by logrotate

Hard links explanation: each daily and weekly snapshot looks like a complete copy, but unchanged files share storage on disk between snapshots (they're hard links — different directory entries pointing at the same inode). So 7 daily + 4 weekly snapshots don't take 11× the base size — they take ~base size plus whatever changed between runs. A few new photos a day is tens of MB of churn, not hundreds of GB.
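The sharing is easy to confirm on a scratch directory. The following is a minimal sketch, assuming GNU coreutils (`cp -al`, `stat -c`); the temporary paths are throwaway stand-ins, not the real backup tree.

```shell
# Demo: a cp -al "snapshot" shares inodes with its source.
demo=$(mktemp -d)
mkdir -p "$demo/current/files" "$demo/daily"
echo "photo bytes" > "$demo/current/files/a.jpg"

# Hard-linked snapshot, the same mechanism used for daily/<date>/
cp -al "$demo/current" "$demo/daily/2026-05-12"

# Same inode number and a link count of 2 mean one copy of the data on disk.
src_inode=$(stat -c %i "$demo/current/files/a.jpg")
snap_inode=$(stat -c %i "$demo/daily/2026-05-12/files/a.jpg")
link_count=$(stat -c %h "$demo/current/files/a.jpg")
echo "inodes: $src_inode $snap_inode, links: $link_count"
```

Snapshots stay frozen because rsync, by default, writes a changed file under a temporary name and renames it into place, which gives `current/` a new inode and leaves the snapshot's old inode untouched. Running rsync with `--inplace` would defeat that and silently rewrite every snapshot sharing the file.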

Snapshot dating: snapshot directories are named for the run's start time. A backup that crosses midnight is dated the day it started.

Why a full mirror tree, not just /data/files: before May 2026 the backup only covered /data/files and the /etc config bundle — Immich and the other self-hosted stacks were deliberately excluded because the old 500 GB drive didn't have room. The new 1 TB drive fits everything, so the scope expanded to cover the whole homelab (with the documented exclusions for bulk-regenerable data and live databases).


The backup drive

  • Device: Samsung 990 PRO 1TB NVMe SSD in an ASMedia ASM2462 USB enclosure (presents over USB 3.2 Gen 2)
  • Filesystem: ext4 with label backup; reserved-blocks set to 0 (tune2fs -m 0) to reclaim ~46 GB of the ext4 root-reserved space — this drive only ever holds backups, so the safety margin reserved blocks normally provide isn't needed
  • Mount point: /mnt/backup
  • UUID: 8795cb2e-fe34-4543-b93b-dd45b642846f
  • Usable space: ~916 GB
  • /etc/fstab entry:
    UUID=8795cb2e-fe34-4543-b93b-dd45b642846f  /mnt/backup  ext4  defaults,nofail  0  2
    
    nofail so the box still boots if the USB drive is disconnected.
  • smartd reference: the by-id path /dev/disk/by-id/usb-ASMT_2462_NVME_2504178506CB-0:0 — stable across USB re-enumeration, unlike /dev/sdX. See 06-drive-monitoring.md.

The 990 PRO is a refurbished ("good"-grade) marketplace purchase. It was verified genuine before being deployed: Samsung PCI vendor ID 0x144d, IEEE OUI 0x002538, model string "Samsung SSD 990 PRO 1TB", firmware 7B2QJXD7, SMART overall-health PASSED, 0% used, 100% available spare, ~12 power-on hours, 0 errors. A quick benchmark hit ~1056 MB/s read / ~1009 MB/s write — that's the USB 3.2 Gen 2 link saturating, not the drive's ceiling (in a PCIe 4.0 slot it does ~7.4 GB/s). USB is the bottleneck here, which is fine for a backup target.

Why the old T5 was retired (and where it went)

The previous backup drive — a Samsung Portable SSD T5 (500 GB, USB, ext4 label backup, UUID 0d44a454-471d-443d-af6c-3e5f422cfa38) — couldn't hold the full backup once Immich's ~520 GB library was in scope. It was reformatted (ext4, label backup-offsite, UUID 2c8e8e38-f129-4823-a1db-1529d3296b44, reserved-blocks 0) and demoted to the off-site rotation drive — see the off-site backup section below.

Swapping the backup drive (the hot-swap, May 2026)

Both drives are USB, so no shutdown was needed:

sudo umount /mnt/backup
# unplug the T5, plug in the 990 PRO enclosure
lsblk                                  # find the new device (e.g. /dev/sdb)
sudo wipefs -a /dev/sdb
sudo parted /dev/sdb mklabel gpt
sudo parted /dev/sdb mkpart primary ext4 1MiB 100%
sudo mkfs.ext4 -L backup /dev/sdb1
sudo tune2fs -m 0 /dev/sdb1            # reclaim the ~46 GB of ext4 reserved blocks
sudo blkid /dev/sdb1                   # get the new UUID
sudo nano /etc/fstab                   # replace the old UUID line with the 8795cb2e... line above
sudo nano /etc/smartd.conf             # replace the old by-uuid line — see 06-drive-monitoring.md
sudo mkdir -p /mnt/backup
sudo mount -a
sudo chown -R gabriel:gabriel /mnt/backup
sudo mkdir /mnt/backup/.beszel         # placeholder dir so the Beszel agent re-detects the new device — see 14-beszel.md
~/scripts/backup-files.sh              # first full run
docker restart beszel-agent            # so Beszel picks up the new "Backup" disk

Rollback files saved on the server (in case the swap needed to be reverted): /etc/fstab.pre-990pro, /etc/smartd.conf.pre-990pro, ~/scripts/backup-files.sh.pre-990pro.

Sizing after the first run + tune2fs -m 0: ~520 GB copied (the first full run took ~9 min over USB), ~916 GB usable → ~396 GB free. Subsequent runs are incremental — seconds to a couple of minutes.


Scope: what's backed up and what isn't

Backed up (covered by backup-files.sh)

Item | Source | Lands at | How
User files | /data/files/ | current/files/ | rsync mirror (--delete)
Immich photo library | /data/docker/immich/library/ | current/immich/ | rsync mirror (--delete); see note below
Music library | /data/music/ | current/music/ | rsync mirror (--delete)
Vaultwarden DB | data/db.sqlite3 (live, WAL mode) | current/db-dumps/vaultwarden-db.sqlite3 | sqlite3 ".backup" (online, safe while open)
Beszel hub DB | data/data.db | current/db-dumps/beszel-data.db | sqlite3 ".backup"
Beszel auxiliary DB | data/auxiliary.db | current/db-dumps/beszel-auxiliary.db | sqlite3 ".backup"
Navidrome DB | data/navidrome.db | current/db-dumps/navidrome.db | sqlite3 ".backup"
Immich config | .env, docker-compose.yml, docker-compose.yml.pre-cuda, hwaccel.ml.yml | current/service-config/immich/ | rsync (excludes library/, postgres/, model-cache/)
Vaultwarden config | .env, docker-compose.yml, data/rsa_key.pem, data/attachments/, data/sends/, data/config.json | current/service-config/vaultwarden/ | rsync (excludes live data/db.sqlite3, data/tmp/)
Navidrome config | .env, docker-compose.yml | current/service-config/navidrome/ | rsync (excludes data/: DB is in db-dumps/, artwork/cache regenerable)
Radicale config + data | .env, docker-compose.yml, config/, data/collections/ | current/service-config/radicale/ | rsync (files-as-truth: calendars/events are plain .ics, no DB; also pulls the harmless .Radicale.cache/)
Beszel config | .env, docker-compose.yml | current/service-config/beszel/ | rsync (excludes data/: DBs in db-dumps/, data/id_ed25519 is root-owned and unreadable; excludes agent-data/, socket/)
Homepage config | .env, config/ | current/service-config/homepage/ | rsync
Caddy config | Caddyfile, Dockerfile, .env, docker-compose.yml | current/service-config/caddy/ | rsync (excludes config/caddy/ and data/caddy/: container-owned mode-700, unreadable; they hold Caddy's autosave + ACME cert/key store, both regenerable)
OpenProject Postgres DB | live db container (postgres:17) | /data/docker/openproject/backups/openproject-db-<date>.sql.gz (last 14) → picked up by the openproject rsync into current/service-config/openproject/backups/ | pg_dump -U postgres openproject inside the db container, piped through gzip (same pattern as Immich's built-in dump; we run it ourselves because OpenProject has no built-in auto-backup)
OpenProject config + assets | .env, docker-compose.yml, proxy/, assets/ (uploaded attachments), backups/ (the .sql.gz from above), the rest of the upstream-cloned tree | current/service-config/openproject/ | rsync (excludes postgres/: live PG data dir, captured via dump instead)
/etc config bundle | Samba, smartd, msmtp, fstab, hostname, hosts, logrotate.d, sudoers.d, AppArmor overrides | system-state/system-config-<date>.tar.gz | the root-owned backup-system-state.sh wrapper (NOPASSWD)
/home/gabriel/scripts/ | shell scripts and small binaries | inside the /etc config bundle | (unchanged)
Package lists + crontab | dpkg --get-selections, apt-mark showmanual, crontab -l | system-state/*.txt | (unchanged)

Immich's photo library — what's in current/immich/. This is Immich's entire UPLOAD_LOCATION (/data/docker/immich/library/): library/ (~254 GB original photos), upload/ (~188 GB app uploads), encoded-video/ (~68 GB transcodes — included, deliberately), thumbs/ (~7.6 GB), backups/ (~979 MB — Immich's own DB dumps, see below), profile/. The rsync runs at 03:00, after Immich's own DB auto-backup at 02:00.

Immich's PostgreSQL database is captured via Immich's built-in dump, NOT a filesystem copy. The postgres/ data dir is owned by the container's uid and a live Postgres data dir rsync'd as a snapshot is corrupt. Instead, Immich's own nightly database auto-backup (Admin → Settings → Backup Settings — default ON, runs 02:00, keeps last 14) writes compressed dumps named immich-db-backup-YYYYMMDDTHHMMSS-v<ver>-pg<pgver>.sql.gz (~137 MB each) into library/backups/ — and the 03:00 current/immich/ rsync picks those up. The Immich "Database Backups" setting must stay enabled for this chain to work. Restore = follow Immich's official backup/restore docs (recreate the DB + extensions, then load the .sql.gz). See 11-immich.md.

The SQLite databases (Vaultwarden, Beszel ×2, Navidrome) are captured with sqlite3 "<live db>" ".backup '<dest>'" from the host's sqlite3 binary — a proper online backup, safe to run while the container holds the WAL-mode DB open. Never raw-rsync a live *.sqlite3 / *.db (or its -wal / -shm files) — you get a corrupt snapshot. This is why sqlite3 was added as a host package (sudo apt install sqlite3); it now shows up in system-state/manual-packages.txt.
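As an illustration of that online-backup step, here is a minimal sketch. The function name `sqlite_online_backup` and the write-to-temp-then-rename detail are assumptions for this example, not taken from backup-files.sh; only the `sqlite3 ... ".backup"` call itself comes from the text above.

```shell
# Hypothetical helper: online-backup one SQLite DB, verify the copy, then publish it.
# Usage: sqlite_online_backup <live-db> <dest-file>
sqlite_online_backup() {
    local src="$1" dest="$2" tmp result
    tmp="$dest.tmp"
    # The sqlite3 CLI creates a missing source DB instead of failing,
    # which would yield an empty "backup"; refuse up front.
    [ -f "$src" ] || return 1
    # .backup uses SQLite's online backup API, so it is safe while the app holds the DB open.
    sqlite3 "$src" ".backup '$tmp'" || { rm -f "$tmp"; return 1; }
    # Only publish a dump that passes SQLite's own consistency check.
    result=$(sqlite3 "$tmp" "PRAGMA integrity_check;") || { rm -f "$tmp"; return 1; }
    if [ "$result" != "ok" ]; then
        rm -f "$tmp"
        return 1
    fi
    mv "$tmp" "$dest"
}
```

A call would look like `sqlite_online_backup <live db> /mnt/backup/current/db-dumps/navidrome.db`. Writing to a temporary name first means a failed run leaves the previous good dump in place rather than a truncated one.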

Not backed up — by design

  • The Ubuntu OS itself (kernel, packages, /usr, /var, etc.). Recovery is "reinstall Ubuntu from ISO, restore configs from the system-state archive, restore data from /mnt/backup/current/." See 08-recovery.md.
  • Immich's postgres/ data dir — covered by the .sql.gz dumps instead (see above). Never rsync a live Postgres data dir — concurrent writes make the snapshot corrupt.
  • OpenProject's postgres/ data dir — same rule, same reason. Covered by the nightly pg_dump to /data/docker/openproject/backups/openproject-db-<date>.sql.gz, which the per-service rsync picks up.
  • Immich's model-cache/ — ML model weights, ~2-3 GB; re-downloaded automatically by the ML container.
  • Navidrome's data/artwork/ + data/cache/ — regenerated on the next library scan.
  • Caddy's config/caddy/ + data/caddy/ — container-owned (mode 700, unreadable by the backup user), and fully regenerable: config/caddy/ is Caddy's autosave, data/caddy/ is the ACME cert/key store — Caddy re-issues every cert via the Cloudflare DNS-01 challenge on restore. See 17-caddy.md.
  • Beszel's data/id_ed25519 — the hub's SSH key, root-owned (the backup user can't read it). Regenerable — but regenerating it means re-registering the one agent. Beszel's agent-data/ + socket/ — agent fingerprint/buffer (rebuildable) and a transient unix socket.
  • Vaultwarden's data/tmp/ — transient upload staging.

What changed about the framing: earlier versions of this doc described Immich as "not yet backed up" with "no off-site safety net" because the old 500 GB drive couldn't fit the library, and they also mentioned a /data/icloud-import/ staging tree that was deleted in May 2026. Both statements are obsolete: Immich now has an on-site backup (/mnt/backup/current/immich/, plus the DB dumps). Note, though, that it is still not on the off-site drive (the photo library is ~520 GB and the off-site T5 is 500 GB). That's a known, accepted gap until a >1 TB off-site drive exists.


The backup script

Location: /home/gabriel/scripts/backup-files.sh. Old version preserved as /home/gabriel/scripts/backup-files.sh.pre-990pro. Still the 3 AM nightly cron job — crontab unchanged.

What it does, step by step:

  1. Auto-remount: if /mnt/backup is not mounted (e.g. a transient USB disconnect), attempts sudo mount /mnt/backup to restore it before proceeding. Logs a NOTICE entry when this fires.
  2. Safety check: aborts if /mnt/backup is still not mounted after the remount attempt (prevents writing backups onto the OS drive if the USB drive is genuinely gone).
  3. Data mirrors: rsync /data/files/ → current/files/, /data/docker/immich/library/ → current/immich/, /data/music/ → current/music/. All with --delete (so files removed from the source are removed from the mirror) and --no-owner --no-group (the Immich library/ files and Radicale's data/ are owned by other uids; the backup runs as gabriel, which can't preserve those, so it doesn't try; the backup drive is single-user, and on restore you re-apply ownership or let the app fix it).
  4. SQLite online backups: sqlite3 "<live db>" ".backup '<dest>'" for vaultwarden-db.sqlite3, beszel-data.db, beszel-auxiliary.db, navidrome.db, refreshing current/db-dumps/ every run.
  5. OpenProject Postgres dump: docker compose exec -T db pg_dump -U postgres openproject | gzip > /data/docker/openproject/backups/openproject-db-<date>.sql.gz; rotates to last 14. Picked up by the openproject service-config rsync in the next step.
  6. Per-service config rsync: → current/service-config/<svc>/ for each of immich, vaultwarden, navidrome, radicale, beszel, homepage, caddy, openproject, each with excludes for bulk data / live DBs / container-owned-unreadable subdirs (see the scope table above). These rsyncs additionally use --delete-excluded.
  7. Daily snapshot: hard-linked copy (cp -al) of current/ → daily/<YYYY-MM-DD>/ (dated by the run's start time).
  8. Weekly snapshot: Sundays only, hard-linked copy → weekly/<YYYY-MM-DD>/.
  9. Prune: keep the newest 7 daily snapshots and the newest 4 weekly snapshots; delete anything older.
  10. System state: capture package lists, crontab, and the /etc config bundle (via the existing root-owned sudo /usr/local/sbin/backup-system-state.sh NOPASSWD wrapper); keep the last 7 tarballs.

Writes to /mnt/backup/backup.log for every run.
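The mount guard, snapshot, and prune steps can be sketched as below. This is an illustration under stated assumptions, not the contents of backup-files.sh: the function names are hypothetical, the paths are parameters so it can be tried on a scratch tree, and skipping a snapshot that already exists for the same date is a choice made for this sketch.

```shell
# Date is fixed once at start, so a run that crosses midnight keeps its start date.
RUN_DATE=$(date +%F)

# Steps 1-2: remount if needed; non-zero exit means the drive is still missing.
ensure_mounted() {
    local mnt="$1"
    mountpoint -q "$mnt" && return 0
    echo "NOTICE: $mnt not mounted, attempting remount"
    sudo mount "$mnt" 2>/dev/null
    mountpoint -q "$mnt"
}

# Steps 7-8: hard-linked snapshot of <root>/current into <root>/<kind>/<date>/.
take_snapshot() {
    local root="$1" kind="$2" date="$3"
    mkdir -p "$root/$kind"
    # Assumption: an existing snapshot for this date is left alone, because
    # cp -al into an existing directory would nest current/ inside it.
    [ -e "$root/$kind/$date" ] && return 0
    cp -al "$root/current" "$root/$kind/$date"
}

# Step 9: keep only the newest <keep> snapshots (names sort chronologically).
prune_snapshots() {
    local dir="$1" keep="$2" old
    ls -1 "$dir" | sort | head -n "-$keep" | while IFS= read -r old; do
        rm -rf -- "${dir:?}/$old"
    done
}

# Intended call order on the real tree (commented out so the sketch is inert):
# ensure_mounted /mnt/backup || { echo "ERROR: /mnt/backup is not mounted"; exit 1; }
# take_snapshot /mnt/backup daily "$RUN_DATE"
# [ "$(date +%u)" -eq 7 ] && take_snapshot /mnt/backup weekly "$RUN_DATE"
# prune_snapshots /mnt/backup/daily 7
# prune_snapshots /mnt/backup/weekly 4
```

The `head -n -N` form requires GNU head; it prints everything except the last N lines, which after sorting are the N newest snapshots.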

Dependencies: the host sqlite3 package (for step 4) — sudo apt install sqlite3. Now in system-state/manual-packages.txt and listed in 08-recovery.md's package-reinstall step.

Permissions: no sudoers change — the script runs as user gabriel (who's in the docker group) and still uses only the two pre-existing NOPASSWD grants (mount /mnt/backup, /usr/local/sbin/backup-system-state.sh). The msmtp * grant is for the separate hourly mount-check script, not this one.

Keep Immich's "Database Backups" setting enabled. The Immich Postgres DB is captured only because Immich dumps it to library/backups/ nightly (02:00) and this script then rsyncs that directory into current/immich/backups/ (03:00). If you ever turn Immich's database auto-backup off, the photo files keep being backed up but the database stops being captured — a silent gap. See 11-immich.md.
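One way to make that gap noisy rather than silent is a freshness check on the mirrored dumps. The sketch below is not part of the existing scripts; the function name and the 26-hour default window are assumptions chosen for illustration.

```shell
# Hypothetical check: succeed only if the newest Immich dump is recent and a valid gzip.
# Usage: immich_dump_is_fresh <backups-dir> [max-age-minutes]
immich_dump_is_fresh() {
    local dir="$1" max_age="${2:-1560}" newest   # 1560 min = 26 h: one nightly cycle plus slack
    # Dump names embed a sortable timestamp, so the last one after sort is the newest.
    newest=$(find "$dir" -maxdepth 1 -type f -name 'immich-db-backup-*.sql.gz' \
                 -mmin "-$max_age" -print | sort | tail -n 1)
    [ -n "$newest" ] || return 1
    # gzip -t catches a truncated or corrupt archive without unpacking it.
    gzip -t "$newest" 2>/dev/null
}
```

For example, `immich_dump_is_fresh /mnt/backup/current/immich/backups || echo "WARNING: no fresh Immich DB dump"` could be appended to the nightly run so the warning lands in backup.log.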


The system-state wrapper script

Location: /usr/local/sbin/backup-system-state.sh

A small root-owned script that does the privileged operations the main backup script can't do as user gabriel:

  • Reads root-owned config files like /etc/msmtprc
  • Creates the system-config-<date>.tar.gz archive
  • Sets ownership of the archive to gabriel
  • Prunes old archives (keeps last 7)

Permissions are critical:

sudo chown root:root /usr/local/sbin/backup-system-state.sh
sudo chmod 755 /usr/local/sbin/backup-system-state.sh

If non-root users could modify this script, the passwordless sudo grant becomes a privilege escalation hole.
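A quick audit of those permissions might look like the following sketch. The function name is hypothetical, and it assumes GNU `stat` and `find`.

```shell
# Hypothetical audit: pass only if <path> is owned by root:root
# and is not writable by group or other.
wrapper_is_locked_down() {
    local path="$1" owner
    owner=$(stat -c '%U:%G' "$path") || return 1
    [ "$owner" = "root:root" ] || return 1
    # find -perm /022 matches when the group-write or other-write bit is set.
    [ -z "$(find "$path" -maxdepth 0 -perm /022)" ]
}
```

Usage: `wrapper_is_locked_down /usr/local/sbin/backup-system-state.sh || echo "wrapper permissions are unsafe"`. The containing directory deserves the same check, since anyone who can write to it can replace the script.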


The mount-check script (hourly monitoring)

Location: /home/gabriel/scripts/check-backup-mount.sh

A small script that runs hourly via cron and emails you if /mnt/backup is not mounted. Silent when everything's fine; sends an alert email when there's a problem. Catches USB disconnect events between the daily 3 AM backup runs so you're notified within an hour rather than discovering it the next day.

The script is intentionally simple — just checks mountpoint -q /mnt/backup, and if that fails, sends an email via msmtp with troubleshooting steps.
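A minimal sketch of such a check follows. It is not the contents of check-backup-mount.sh: the function name, the `MAILER` override (there only so the function can be exercised without sending mail), and the message wording are assumptions; the subject line matches the alert named in the troubleshooting section.

```shell
# Sketch of an hourly mount check: silent when mounted, one email when not.
# Usage: check_backup_mount <mount-point> <recipient>
check_backup_mount() {
    local mnt="$1" to="$2"
    mountpoint -q "$mnt" && return 0    # healthy: stay silent
    {
        printf 'To: %s\n' "$to"
        printf 'Subject: [ALERT] Backup drive unmounted\n\n'
        printf '%s is not mounted on %s.\n' "$mnt" "$(uname -n)"
        printf 'Try: lsblk; sudo dmesg | tail -50; sudo mount -a\n'
    } | ${MAILER:-sudo msmtp} "$to"     # msmtp reads the message from stdin
    return 1
}
```

From cron this would be called as `check_backup_mount /mnt/backup <recipient>`; the non-zero return on failure also makes it usable as a guard in other scripts.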


The sudo grants

File: /etc/sudoers.d/gabriel-backup

gabriel ALL=(root) NOPASSWD: /usr/local/sbin/backup-system-state.sh
gabriel ALL=(root) NOPASSWD: /usr/bin/mount /mnt/backup
gabriel ALL=(root) NOPASSWD: /usr/bin/msmtp *

Three narrow grants for gabriel to run specific commands as root without a password:

  1. backup-system-state.sh — privileged wrapper that bundles /etc config files into the system-state archive
  2. mount /mnt/backup — used by the auto-remount safety net in the backup script
  3. msmtp * — used by check-backup-mount.sh to send alert emails (msmtp's config is root-readable only). The * wildcard is there for the recipient email argument, but note that in sudoers a * in the argument position matches any arguments at all, so this grant is broader than the other two.

Required because cron jobs run non-interactively and can't enter passwords. The expanded backup script (Immich, the SQLite dumps, the per-service config) needs no new grant — all of it runs as gabriel in the docker group.

To edit safely, always use:

sudo visudo -f /etc/sudoers.d/gabriel-backup

visudo validates syntax before saving, preventing lockouts.


Cron schedule

crontab -l

Should show:

0 3 * * * /home/gabriel/scripts/backup-files.sh
0 * * * * /home/gabriel/scripts/check-backup-mount.sh

Translation:

  • Daily backup at 3:00 AM (the off-site script is not in cron; see below)
  • Mount check every hour on the hour

To edit:

crontab -e

Off-site backup: the T5 at family's

The old 500 GB Samsung T5 was reformatted and demoted to the off-site rotation drive. It lives at Gabriel's parents' house and is brought in only to refresh the off-site copy during a family visit.

  • Device: Samsung Portable SSD T5 (500 GB, USB)
  • Filesystem: ext4 with label backup-offsite, reserved-blocks 0 (tune2fs -m 0)
  • UUID: 2c8e8e38-f129-4823-a1db-1529d3296b44
  • Mount point: /mnt/offsite (created on the server; no /etc/fstab entry — it's mounted manually only when the drive is plugged in)
  • Script: /home/gabriel/scripts/backup-offsite.sh (new; manual, not in cron)

The ritual

When you're at the apartment with the T5 in hand (also documented in the script's header comment):

# 1. plug the T5 into the z2mini
# 2. mount it (by label — the device name doesn't matter)
sudo mount /dev/disk/by-label/backup-offsite /mnt/offsite
# 3. run the off-site sync
~/scripts/backup-offsite.sh
# 4. unmount
sudo umount /mnt/offsite
# 5. unplug the T5 and carry it back off-site

What backup-offsite.sh does

It copies, onto the T5 (~3 GB total):

  • /mnt/backup/current/files/ — documents
  • /mnt/backup/current/music/ — music library
  • /mnt/backup/current/db-dumps/ — the Vaultwarden vault DB dump, Beszel ×2, Navidrome
  • /mnt/backup/current/service-config/ — every service's config (Immich/Navidrome/Radicale/Vaultwarden/Beszel/Homepage/Caddy .env + compose, the Radicale calendars, the Vaultwarden rsa_key.pem + attachments + sends, etc.)
  • /mnt/backup/system-state/ — package lists, crontab, the /etc config tarballs

and writes /mnt/offsite/offsite-backup.log.
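The selection above can be sketched as a loop over the included subtrees. This is an illustration, not the contents of backup-offsite.sh: the function name, the rsync flags, and the mirrored directory layout on the T5 are assumptions, and logging is omitted.

```shell
# Sketch: mirror everything listed above onto the off-site drive, leaving out current/immich/.
# Usage: offsite_sync <backup-root> <offsite-root>
offsite_sync() {
    local backup_root="$1" offsite_root="$2" sub
    for sub in current/files current/music current/db-dumps \
               current/service-config system-state; do
        [ -d "$backup_root/$sub" ] || continue
        mkdir -p "$offsite_root/$sub"
        # Trailing slashes: copy the directory's contents, not the directory itself.
        rsync -a --delete --no-owner --no-group \
            "$backup_root/$sub/" "$offsite_root/$sub/" || return 1
    done
}
```

On the real drives the call would be `offsite_sync /mnt/backup /mnt/offsite`, ideally preceded by `mountpoint -q /mnt/offsite || exit 1` so a forgotten mount does not fill the OS disk.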

What it does NOT copy — the accepted gap

It does not copy /mnt/backup/current/immich/ — the ~520 GB photo library won't fit on a 500 GB drive. So the off-site drive protects documents, music, the DB dumps, every service's config (including the Vaultwarden vault DB dump and the Radicale calendars), and system-state against fire/theft — but the photo library has copies only at the apartment (/data live + /mnt/backup backup). That's a known, accepted gap until a >1 TB off-site drive exists.

There is an unexercised option to also fit the irreplaceable photo originals (library/library/ + library/upload/ ≈ 442 GB) onto the 500 GB T5 by skipping thumbs/, encoded-video/, and backups/. Deferred; Gabriel's call.


Common operations

Run a backup manually:

~/scripts/backup-files.sh

Useful before risky operations or when you want a fresh snapshot.

Check the log:

tail -50 /mnt/backup/backup.log

See what's been backed up:

ls /mnt/backup/current/                 # files/ immich/ music/ db-dumps/ service-config/
ls /mnt/backup/current/db-dumps/        # the SQLite online backups
ls /mnt/backup/current/service-config/  # one dir per service
ls /mnt/backup/daily/                   # daily snapshots
ls /mnt/backup/weekly/                  # weekly snapshots
ls /mnt/backup/system-state/            # config backups

Check backup drive space:

df -h /mnt/backup

Inspect a system-state archive:

tar -tzf /mnt/backup/system-state/system-config-2026-05-12.tar.gz

Run the off-site sync (T5 plugged in and mounted at /mnt/offsite):

~/scripts/backup-offsite.sh
tail -30 /mnt/offsite/offsite-backup.log

Restoring a deleted file

To recover a single file from a snapshot:

# Find the file in a snapshot — note the /files/ subdir (current/ is no longer a bare /data/files mirror)
ls /mnt/backup/daily/2026-05-11/files/path/to/your/file

# Copy it back to live storage
cp /mnt/backup/daily/2026-05-11/files/path/to/your/file /data/files/path/to/your/file
sudo chown gabriel:gabriel /data/files/path/to/your/file

You can also browse to \\z2mini\backup from Windows or iOS and copy the file out of the read-only share — the path inside is current\files\..., daily\<date>\files\..., or weekly\<date>\files\... — back into your live \\z2mini\files share.

The same pattern works for the other mirrored trees — daily/<date>/music/... for a music file, daily/<date>/immich/... for a raw photo file (though for Immich, restoring through Immich's own restore flow is usually the right move — see 11-immich.md).
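When it is unclear which snapshot holds the version you want, a small helper can list every copy. The function below is a hypothetical convenience, not an existing script, and assumes GNU `stat`.

```shell
# Hypothetical helper: list every snapshot that still holds <relpath>, with size and mtime.
# Usage: find_in_snapshots <backup-root> <path relative to the snapshot, e.g. files/docs/a.txt>
find_in_snapshots() {
    local root="$1" rel="$2" snap
    for snap in "$root"/daily/* "$root"/weekly/*; do
        [ -e "$snap/$rel" ] || continue
        # %s = size in bytes, %y = modification time, %n = full path
        stat -c '%s  %y  %n' "$snap/$rel"
    done
}
```

For example, `find_in_snapshots /mnt/backup files/path/to/your/file` shows at a glance where the size or modification time changes between snapshots.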


Log rotation

The backup log is rotated weekly via /etc/logrotate.d/backup-files:

/mnt/backup/backup.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    create 0644 gabriel gabriel
}

You'll find old logs as backup.log.1.gz, backup.log.2.gz, etc. in the same directory (with compress and no delaycompress, every rotated log is gzipped immediately).


Troubleshooting

Receiving "[ALERT] Backup drive unmounted" email:

The hourly mount check fired. Investigate:

ssh gabriel@z2mini
lsblk                           # is the drive present (sda or sdb)?
sudo dmesg | tail -50           # any USB disconnect/reconnect events?
sudo mount -a                   # try to remount
df -h | grep backup             # confirm mount succeeded

If lsblk shows the drive but with a different name than expected (e.g. sda instead of sdb), the drive disconnected and reconnected with a new device name. The fstab entry uses UUID (8795cb2e-...) and the smartd config uses the stable /dev/disk/by-id/usb-ASMT_2462_...-0:0 path, so this doesn't break either — but if you ever see smartd alerts about a "missing device," verify /etc/smartd.conf references the drive by that by-id path rather than /dev/sdX. See 06-drive-monitoring.md.

"ERROR: /mnt/backup is not mounted and remount failed":

The backup script tried to auto-remount and couldn't. The drive is genuinely disconnected or has a deeper issue.

lsblk                           # is the drive even visible?
sudo dmesg | tail -100          # what happened?

If the drive isn't visible at all: physical USB issue (cable, port, enclosure, drive failure).

If the drive is visible but won't mount: filesystem issue. Run fsck:

sudo fsck -y /dev/sdb1          # or whatever device name lsblk shows
sudo mount -a

"System configuration capture failed":

Check the wrapper script and sudoers file are intact:

ls -la /usr/local/sbin/backup-system-state.sh
sudo cat /etc/sudoers.d/gabriel-backup

The script should be owned by root:root mode 755. The sudoers file should have all three grants (backup-system-state.sh, mount /mnt/backup, msmtp).

A db-dumps/ file (*.sqlite3 or *.db) is empty or zero bytes:

The sqlite3 ".backup" step failed — usually because the host sqlite3 binary isn't installed (sudo apt install sqlite3) or the source DB path is wrong (a service's compose layout changed). Check backup.log for the error around the db-dumps step. Don't "fix" it by raw-rsyncing the live .sqlite3 — that produces a corrupt copy; fix the sqlite3 invocation instead.
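To spot the problem without reading the whole log, a check like the following lists dumps that are empty or were not refreshed recently. The function name and the 26-hour default window are assumptions for this sketch.

```shell
# Hypothetical check: print dumps that are zero bytes or older than <max-age-minutes>.
# Usage: suspect_db_dumps <db-dumps-dir> [max-age-minutes]
suspect_db_dumps() {
    local dir="$1" max_age="${2:-1560}"   # 1560 min = 26 h
    find "$dir" -maxdepth 1 -type f \( -name '*.sqlite3' -o -name '*.db' \) \
        \( -size 0 -o -mmin "+$max_age" \) -print
}
```

Running `suspect_db_dumps /mnt/backup/current/db-dumps` should print nothing after a healthy nightly run, since every dump is rewritten each time.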

Backup drive filling up:

Check current usage:

du -sh /mnt/backup/*

Hard-linked snapshots typically use very little incremental space. If you're seeing rapid growth, you may have lots of changing files (a big new batch of photos, say). Worst case: reduce retention by editing the script's head -n -7 and head -n -4 values, or revisit the Immich encoded-video/ inclusion.
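Because `du` counts each inode only once per invocation, passing all snapshots to a single `du` call shows what each one added on top of the earlier ones, rather than its apparent full size. A minimal sketch (the function name is hypothetical):

```shell
# Per-snapshot incremental usage: each line is the space that snapshot added
# beyond the snapshots listed before it; the last line is the real total.
# Usage: snapshot_growth <dir containing dated snapshot dirs>
snapshot_growth() {
    local dir="$1"
    # -s: one line per snapshot, -k: KiB units, -c: grand total.
    # The glob expands in name order, which is chronological for YYYY-MM-DD names.
    ( cd "$dir" && du -skc -- * )
}
```

A snapshot whose line is unexpectedly large points at the day the churn happened, for example `snapshot_growth /mnt/backup/daily`.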

Backup hasn't run in days:

# Check cron is running
sudo systemctl status cron

# Check the cron log
sudo journalctl -u cron --since "2 days ago" | grep backup

# Verify the crontab entries are still there
crontab -l

Future improvements (not yet implemented)

  • A >1 TB off-site drive — so the Immich photo library (~520 GB and growing) can be off-site too, not just at the apartment. Right now backup-offsite.sh covers everything except the photos because the off-site T5 is only 500 GB. (Interim option: fit just the ~442 GB of irreplaceable photo originals onto the 500 GB T5 by skipping thumbs/transcodes/DB-dumps — deferred, Gabriel's call.)
  • Ansible playbook — more sophisticated alternative to the system-state tar approach. Defines the entire system configuration as code, lets you rebuild on any hardware. Would replace the tar approach with a cleaner, version-controlled equivalent.

(Resolved since the last revision: an off-site backup at family's place — now implemented as the T5 + backup-offsite.sh ritual above; and wiring Immich / Navidrome / Beszel / Radicale / Vaultwarden / Homepage / Caddy into the nightly backup — all done.)


Files and locations

Purpose | Path
Main backup script | /home/gabriel/scripts/backup-files.sh (rollback: …/backup-files.sh.pre-990pro)
Off-site backup script (manual, not cron) | /home/gabriel/scripts/backup-offsite.sh
Hourly mount-check script | /home/gabriel/scripts/check-backup-mount.sh
Privileged wrapper (system state) | /usr/local/sbin/backup-system-state.sh
Sudo grants | /etc/sudoers.d/gabriel-backup
On-site backup drive mount | /mnt/backup (Samsung 990 PRO 1TB / ASM2462 USB enclosure, UUID 8795cb2e-…)
Off-site backup drive mount | /mnt/offsite (Samsung T5 500GB, label backup-offsite, UUID 2c8e8e38-…; mounted manually)
Backup log | /mnt/backup/backup.log
Off-site backup log | /mnt/offsite/offsite-backup.log
Log rotation config | /etc/logrotate.d/backup-files
Cron entries | user crontab (crontab -l)
fstab rollback | /etc/fstab.pre-990pro
smartd.conf rollback | /etc/smartd.conf.pre-990pro
Host package dependency | sqlite3 (for the online DB backups)
Recovery procedure | /mnt/backup/RECOVERY-README.md