Operations: backup, upgrade, migrate
Looking after a running bty-web: back its state up, upgrade the software, and move it to new hardware (or a new host).
What counts as state
bty-web keeps everything in one directory, BTY_PATHS_STATE_DIR (default
/var/lib/bty; the bty-data named volume in the container deploy):
| Path | What | Backup? |
|---|---|---|
| | The SQLite database: machine records, MAC->image assignments, catalog metadata, server settings, sessions, and the audit log. | Yes – this is the irreplaceable bit. |
| | The netboot artifacts ( | Optional – re-fetchable via “Fetch netboot artifacts”. |
| | The active catalog manifest. | Optional – re-fetchable from the upstream. |
bty-web (v0.40+) holds no image bytes. Image bytes live in
withcache under
./data/withcache/, populated on first flash of each URL and
backed up independently.
A minimal backup is just state.db; a full backup is the whole
/var/lib/bty tree.
Data separation and read-only-OS readiness
bty-web is built so that all mutable runtime state lives under
BTY_PATHS_STATE_DIR (/var/lib/bty) – a single writable volume. In the
container deploy that volume is bty-data: the container image is
immutable and recovery is “pull a new image, re-attach the volume.” The
rest of this section is the readiness checklist for that split.
All of bty-web’s runtime writes already land under /var/lib/bty, split
into three classes:
| Path | Class | Notes |
|---|---|---|
| | precious | records: machines, catalog, settings, audit log |
| | ephemeral | netboot artifacts – version-coupled; refetch on a bty version bump |
| | regenerable | cookie key |
Precious = carry across a migration / back up. state.db carries the
machine bindings + audit log + settings; v0.33.0+ auto-rotates it
on a version mismatch. The bty-web export bundle (v3, metadata-only)
carries the per-machine hardware identity (mac + hw_lshw +
known_disks) so a re-imported machine shows up pre-fingerprinted;
bindings reset and the operator re-binds.
Image bytes live in withcache (separate process, separate data dir, backed up independently). v0.40+ took bty-web out of the bytes plane; the live env streams from withcache or from the catalog URL’s origin, never from bty-web’s filesystem.
Ephemeral = safe to lose, re-created on demand. boot/ is the subtle
one: it lives on the writable volume (so a read-only OS is possible)
but is re-fetched when it no longer matches the running bty-web
version, rather than preserved as precious.
The container deploy already realises this split: the bty-web container
image is immutable, and /var/lib/bty is the bty-data named volume that
carries everything precious. $BTY_ADMIN_PASSWORD is supplied via the
container env rather than written into the image. Pulling a new image and
re-attaching the volume is the whole upgrade.
Backup
state.db is a single SQLite file. The safe way to copy a live database is
SQLite’s online backup (consistent even while bty-web is running):
sqlite3 /var/lib/bty/state.db ".backup '/tmp/bty-state-$(date +%F).db'"
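The same online backup is available from Python's standard sqlite3 module through Connection.backup(), which can be convenient in a scripted backup job. The sketch below is illustrative only: the helper name online_backup is not part of bty, and the integrity check at the end is an optional extra rather than something bty-web is known to run.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def online_backup(src: Path, dest: Path) -> None:
    """Copy a live SQLite database to dest with SQLite's online backup API.

    Illustrative helper, not part of bty. The copy is consistent even if
    another process writes to src while the backup runs.
    """
    if not src.is_file():
        # sqlite3.connect() would silently create an empty database otherwise.
        raise FileNotFoundError(src)
    with closing(sqlite3.connect(src)) as source, \
            closing(sqlite3.connect(dest)) as target:
        source.backup(target)
        # integrity_check yields a single row ("ok",) for a healthy database.
        (result,) = target.execute("PRAGMA integrity_check").fetchone()
        if result != "ok":
            raise RuntimeError(f"backup failed integrity check: {result}")
```

Called as online_backup(Path("/var/lib/bty/state.db"), Path("/tmp/bty-state.db")), it writes the same kind of copy as the sqlite3 command above, assuming the default state directory.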
A plain cp also works if bty-web is stopped first:
sudo systemctl stop bty-web
cp -a /var/lib/bty/state.db ~/bty-state-backup.db
sudo systemctl start bty-web
For a full backup of everything bty-web manages (records + netboot
artifacts + any on-disk backup bundles), copy the whole directory while
bty-web is stopped:
sudo systemctl stop bty-web
sudo tar -C /var/lib -czf ~/bty-state-$(date +%F).tar.gz bty
sudo systemctl start bty-web
This does NOT include cached image bytes – since v0.40, bty-web is out of the image-bytes plane; cached blobs live in the separate withcache data dir (its own container volume). Back that up independently if you need the cache to survive (otherwise withcache re-fills on demand from the upstream catalog).
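The full-tree archive can also be produced and checked from a script. The following is a minimal sketch using Python's tarfile module; archive_state_dir is a hypothetical helper, and as with the tar command it assumes bty-web has been stopped first and that the destination lies outside the state directory.

```python
import tarfile
from pathlib import Path


def archive_state_dir(state_dir: Path, dest: Path) -> list[str]:
    """Write a gzip tarball of state_dir and return the archived member names.

    Hypothetical helper, not part of bty. dest must not live inside
    state_dir, or the archive would try to include itself.
    """
    with tarfile.open(dest, "w:gz") as tar:
        # arcname stores relative paths ("bty/state.db"), matching
        # `tar -C /var/lib -czf ... bty`.
        tar.add(state_dir, arcname=state_dir.name)
    # Re-open the finished archive so the listing reflects what was written.
    with tarfile.open(dest, "r:gz") as tar:
        return tar.getnames()
```

Checking that "bty/state.db" appears in the returned names is a cheap way to confirm the precious file made it into the archive before the old copy is discarded.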
Restore by putting the file(s) back under /var/lib/bty (bty-web
stopped) and starting the service.
Scheduled backups (UI-driven, since v0.25.7)
The /ui/backups page carries a Back up now trigger; the
Schedule card on /ui/settings#backup-schedule sets the cadence
(daily / weekly / manual) and retention (keep N most recent).
The scheduler ticks every 60s; a change in Settings takes effect
on the next tick without restarting bty-web.
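One way a periodic tick can pick up a settings change without a restart is to re-read the cadence on every tick and decide from scratch whether a run is due. How bty-web implements this internally is not described here, so the sketch below is an assumption throughout: the INTERVALS mapping and the backup_due function are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical cadence -> interval mapping; "manual" never fires on its own.
INTERVALS = {"daily": timedelta(days=1), "weekly": timedelta(weeks=1)}


def backup_due(cadence: str, last_run: Optional[datetime], now: datetime) -> bool:
    """Decide on a single scheduler tick whether a backup should start.

    cadence is read fresh on each tick, so a changed setting applies on the
    very next call without restarting anything.
    """
    interval = INTERVALS.get(cadence)
    if interval is None:  # "manual" or an unrecognised value
        return False
    # No previous run counts as due; otherwise wait out the full interval.
    return last_run is None or now - last_run >= interval
```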
Each backup is a directory written under $BTY_PATHS_BACKUP_DIR
(default $BTY_PATHS_STATE_DIR/backups) named after the ISO-8601
timestamp, e.g. 2026-05-24T08-00-00Z/. The bundle layout is
identical to what bty-web export produces (a single
inventory.json carrying per-machine mac + hw_lshw +
known_disks), so a scheduled backup is interchangeable with a
manual one. Image bytes are NOT included – bty-web doesn’t have
any (v0.40+); withcache holds the cached blobs independently.
Retention prunes the oldest siblings after every successful run.
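The naming and retention rules above can be sketched in a few lines. This is not bty-web's implementation: backup_dir_name and prune_backups are hypothetical helpers, and the choice to ignore directories that do not match the timestamp pattern is an assumption made so that hand-made bundles in the same directory are never pruned.

```python
import re
import shutil
from datetime import datetime, timezone
from pathlib import Path

# Matches names such as 2026-05-24T08-00-00Z.
STAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z$")


def backup_dir_name(moment: datetime) -> str:
    """Format a timezone-aware datetime as a colon-free UTC directory name."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")


def prune_backups(backup_dir: Path, keep: int) -> list[str]:
    """Delete all but the `keep` newest timestamped dirs; return removed names."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Fixed-width UTC names sort lexicographically in chronological order.
    names = sorted(
        p.name for p in backup_dir.iterdir() if p.is_dir() and STAMP.match(p.name)
    )
    stale = names[:-keep]
    for name in stale:
        shutil.rmtree(backup_dir / name)
    return stale
```

Because the names are fixed-width and in UTC, a plain string sort is enough to find the oldest siblings; no timestamp parsing is needed.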
Two env vars tune the feature when the in-UI knobs aren’t enough:
| Variable | Default | Meaning |
|---|---|---|
| | | Where backup directories land. Move off the OS disk if you want them to survive an OS reflash. |
| | | Max concurrent backup jobs. Concurrent exports race on dest dirs; leave at 1 unless you have a reason. |
History lands in the audit log under subject_kind=backup (kinds
backup.created / backup.failed / backup.pruned); the
/ui/backups page also surfaces the recent rows in a card at
the bottom.
Portable export / import (operator data only)
tar-copying the whole tree (above) is the verbatim option. The
bty-web export / import subcommands are the slim one: they move
only the expensive-to-recollect half – per-machine hardware identity
(mac + lshw + known_disks from the box’s last live-env boot) –
and nothing else. The catalog, machine bindings, audit log, settings,
and netboot artifacts are deliberately left behind so an upgrade lands
on a fresh, regenerable state and the operator re-binds. Reach for
this to migrate hardware fingerprints across an upgrade or to a new
host without dragging the rest along.
# On the old server (reads BTY_PATHS_STATE_DIR):
bty-web export /tmp/bty-bundle
# Copy /tmp/bty-bundle to the new server, then:
bty-web import /tmp/bty-bundle
The in-UI Back up now trigger on /ui/backups produces the
same bundle shape; reach for the CLI when scripting (cron / a
podman exec into the container / packaging into an archive pipeline)
and the UI when you want an ad-hoc snapshot without leaving the browser.
What a bundle carries, and what it deliberately leaves behind:
Travels |
Stays behind (fresh on the destination) |
|---|---|
Machine |
Boot mode (every machine imports as |
|
Image binding + |
|
The |
The image catalog ( |
|
The netboot artifacts (re-fetch to match the new version) |
|
Server settings + the audit log |
Resetting the boot mode is the point: a freshly-migrated machine
shouldn’t auto-flash against netboot artifacts you haven’t refreshed
yet. Each box arrives as a re-discovered bty-inventory box with its
hardware identity pre-filled; you re-bind it and re-enable a flash mode
once the new server is verified and its netboot artifacts re-fetched.
A bundle is a plain directory (a single inventory.json), so
tar it – or just cp – for archival.
Upgrade
bty pre-1.0 has no database migration framework. The DB carries
the exact bty.__version__ that created it in a bty_version
table. When the running release doesn’t match, bty-web automatically
rotates the old state.db to state.db.<from>.<ts>.bak and creates
a fresh one in its place. Every release is therefore breaking for
state, by design – but the rotation needs no operator action.
Auto-rotate on schema mismatch (v0.33.0+)
On bty-web startup, if the stored bty_version disagrees with the
running release (or the DB is pre-versioning – data tables present
without the marker), init_db does:
1. Renames state.db to state.db.<from-version>.<UTC-iso>.bak (e.g. state.db.0.27.4.20260525T101530Z.bak). The old DB is preserved on disk for forensics.
2. Unlinks the WAL sidecars (state.db-journal / -wal / -shm) so the fresh DB doesn’t pick up stale pages.
3. Creates a fresh state.db with the running release’s schema, stamped with bty.__version__.
4. Records a system.schema.reset event with details {from_version, to_version, archived_at}. The event surfaces as an unacknowledged tripwire on /ui/dashboard; acknowledge it from /ui/events.
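The first three rotation steps can be sketched as follows. This is not the real init_db: the single-column layout of the bty_version table, the "unknown" placeholder used when the marker is missing, and every function name are assumptions made for illustration. The sketch treats any database without the marker as mismatched and leaves out the audit event from the last step.

```python
import sqlite3
from contextlib import closing
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def read_version(db: Path) -> Optional[str]:
    """Return the stored version, or None if the marker table is absent."""
    with closing(sqlite3.connect(db)) as conn:
        try:
            # Assumed schema: bty_version(version TEXT).
            row = conn.execute("SELECT version FROM bty_version").fetchone()
        except sqlite3.OperationalError:  # no such table: pre-versioning DB
            return None
    return row[0] if row else None


def stamp_fresh(db: Path, version: str) -> None:
    """Create a new database carrying only the version marker."""
    with closing(sqlite3.connect(db)) as conn:
        conn.execute("CREATE TABLE bty_version (version TEXT NOT NULL)")
        conn.execute("INSERT INTO bty_version VALUES (?)", (version,))
        conn.commit()


def rotate_if_mismatched(db: Path, running: str) -> Optional[Path]:
    """Archive db and start fresh when versions differ; return the archive path."""
    if not db.exists():
        stamp_fresh(db, running)
        return None
    stored = read_version(db)
    if stored == running:
        return None
    ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = db.with_name(f"{db.name}.{stored or 'unknown'}.{ts}.bak")
    db.rename(archive)  # step 1: keep the old DB for forensics
    for suffix in ("-journal", "-wal", "-shm"):  # step 2: drop stale sidecars
        db.with_name(db.name + suffix).unlink(missing_ok=True)
    stamp_fresh(db, running)  # step 3: fresh DB stamped with the running version
    return archive
```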
Operator-irreplaceable state lives outside state.db:
- Netboot artifacts under BTY_PATHS_BOOT_DIR – not touched.
- Backup bundles under ${BTY_PATHS_STATE_DIR}/backups/ – not touched.
- Withcache blobs under the separate withcache data dir – not touched (different process).
What rotation discards: machine bindings, the audit log, operator-overridden settings, the catalog cache index. Bindings re-discover on the next PXE contact from each machine.
Preserve hardware inventory across an upgrade
If you want MAC + lshw + known_disks to survive the rotation,
export before upgrading and import after:
# Before upgrade: snapshot to a portable bundle.
BUNDLE=/var/lib/bty/backups/pre-$(date +%Y%m%d)
sudo bty-web export "$BUNDLE"
# Upgrade bty-web (pip / pipx / container image pull), then, in the same
# shell so $BUNDLE still names the export directory:
sudo bty-web import "$BUNDLE"
The slim bundle carries a minimal per-machine record
(mac + hw_lshw + known_disks) and nothing else: bindings
(boot_mode, bty_image_ref, target_disk_serial, sanboot_drive,
labels) reset on import and the operator re-binds; the image catalog
(catalog_entries) does not travel either – re-import the catalog
on the new appliance via the Settings page’s “Fetch latest catalog”
button (or upload a catalog.toml directly). See “Backup”.
Recovering an old .bak
The rotated DB is a normal SQLite file. Read it with the sqlite3
CLI to recover specific rows:
sqlite3 /var/lib/bty/state.db.0.27.4.20260525T101530Z.bak \
"SELECT mac, bty_image_ref, boot_mode FROM machines"
Once you no longer need it, rm it like any other file.
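When recovering rows from a script rather than the CLI, opening the archive read-only guards against modifying it by accident. A minimal sketch with Python's sqlite3 module follows; read_machines is a hypothetical helper, and the column names are the ones used in the query above.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def read_machines(archive: Path) -> list[tuple]:
    """Return (mac, bty_image_ref, boot_mode) rows from a rotated database."""
    # mode=ro makes SQLite refuse writes to the archived file.
    uri = f"{archive.resolve().as_uri()}?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        return conn.execute(
            "SELECT mac, bty_image_ref, boot_mode FROM machines"
        ).fetchall()
```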
Upgrade in place (pip / pipx install)
If you installed bty-lab directly:
pipx upgrade bty-lab # or: pip install -U bty-lab
sudo systemctl restart bty-web
Re-fetch the netboot artifacts after upgrading. The live-env
artifacts in BTY_PATHS_BOOT_DIR (kernel / initrd / squashfs) are versioned
and fetched separately from bty-web – the package upgrade does NOT
touch them. So a freshly-upgraded server keeps serving the previous
live env until you refresh it: open /ui/netboot and click Fetch
latest artifacts (or pin a tag under Settings -> Upstream sources
first). Skip this and PXE clients boot the old live env against the new
server – a confusing version split.
Upgrade the container deploy
In the container deploy the upgrade is a single bty-lab upgrade call.
It regenerates compose against the CLI’s bty version (image-tag pin moves
forward), preserves envvars + data/, pulls new images, and restarts
the stack – auto-detecting whether to drive that via podman compose up -d (plain) or systemctl restart (Quadlet-managed):
uvx bty-lab upgrade /opt/bty # the dir you bootstrapped with `init` / `deploy`
For step-by-step control, run the pieces manually (re-emit, then pull + restart):
cd /opt/bty
uvx bty-lab init --force . # regenerates compose.yml against newer bty
podman compose --env-file envvars --profile tftp pull
podman compose --env-file envvars --profile tftp up -d
AutoUpdate=registry plus podman-auto-update.timer automate the pull
step for the Quadlet variant (init --systemd). After the pull, re-fetch
the netboot artifacts (open /ui/netboot -> Fetch latest artifacts) so
PXE clients boot a live env matching the new bty-web version. See
deploy/README.md.
Migrate to a new host
Stop bty-web on the old host, copy the deploy directory’s data/ tree
(or /var/lib/bty for a host install) to the new host, and start bty-web
there. Provided the new host runs the same bty version, the MAC->image
assignments and audit log come with it (a different version triggers the
auto-rotate described under Upgrade); only the host’s own IP changes.
Re-point your LAN DHCP at the new host’s IP and re-fetch the netboot
artifacts on the new instance.
Recovering from a failed or interrupted flash
A flash writes directly to the target disk; bty has no rollback. If a flash fails partway - network drop, integrity mismatch, operator Ctrl+C, a wedged disk - assume the target holds partial, unbootable data and re-flash it from a trusted source. There is nothing to clean up first: the next flash overwrites from byte 0.
- Integrity mismatch (FlashIntegrityError): the streamed bytes did not match the source’s digest. The disk was already written (a stream can’t be checked before it’s written), so it is suspect. Re-flash from a source you trust; if it recurs with the same source, the upstream artifact or its published digest is wrong.
- Interrupted download / cancel: re-run the flash. For a server-driven PXE box, just let it boot again - the plan re-flashes.
- Stuck on the live env after a crash mid-flash: if a machine fetched its boot artifacts but never POSTed /pxe/{mac}/done, its saw_flasher_boot bit stays set and it keeps booting the flasher. Re-save the machine record in /ui/machines (or fix boot_mode) to clear the state.
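The integrity-mismatch case follows from how a streamed write is verified: the digest is only complete once the last chunk has passed through, by which point the target already holds the data. The sketch below illustrates that ordering. It is not bty's flasher: SHA-256 is an assumed digest algorithm, IntegrityMismatch is a hypothetical stand-in for FlashIntegrityError, and stream_and_verify is an illustrative name.

```python
import hashlib
from typing import BinaryIO


class IntegrityMismatch(Exception):
    """Hypothetical stand-in for bty's FlashIntegrityError."""


def stream_and_verify(source: BinaryIO, target: BinaryIO, expected_sha256: str,
                      chunk_size: int = 1 << 20) -> int:
    """Copy source to target chunk by chunk, hashing as it goes.

    Returns the number of bytes written. The digest is known only after the
    final chunk, so a mismatch is raised after the target has been written.
    """
    digest = hashlib.sha256()
    written = 0
    while chunk := source.read(chunk_size):
        target.write(chunk)   # bytes reach the target first...
        digest.update(chunk)  # ...and are folded into the running hash
        written += len(chunk)
    if digest.hexdigest() != expected_sha256:
        raise IntegrityMismatch(f"wrote {written} bytes, digest mismatch")
    return written
```

A caller that catches the exception should treat the target as suspect and re-flash, which matches the guidance above: there is no way to reject the data before it has been written.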