Flows

The four end-to-end paths bty supports. Pick by what infrastructure you have:

  • Direct flash - one-off provisioning, no server, USB live stick with the catalog baked onto its BTY_IMAGES partition.

  • USB + network catalog - USB live stick boots a target as in the direct path, but the catalog comes from a remote bty-web instance (commonly the ghcr.io/safl/bty-web Docker container on a teammate’s workstation). Same flash mechanics, shared catalog. No PXE.

  • Interactive PXE flash - server is up, operator picks an image from the bty wizard after PXE boot (machine set to boot_mode=bty-tui).

  • Server-driven PXE flash - fleet image flashing; machines reflash themselves on schedule / on demand / on failure.

Direct flash (USB live, offline)

Ad-hoc provisioning of a single box, no infrastructure on the network.

  1. Operator boots the target from the bty USB live image (built by bty-media).

  2. The live env auto-launches bty on tty1 via bty-on-tty1.service. Without bty.mac= on the kernel cmdline, the wizard runs in local-only mode: scans the BTY_IMAGES partition for local image files.

  3. Operator picks an image (Enter), picks a target disk (Enter), confirms the flash plan (y / Enter).

  4. bty writes the image and reports success.

  5. Operator removes the USB stick and reboots; the target boots into the freshly-flashed image.

The whole flow runs offline. No network, no server, no MAC registration.

USB + network catalog (bty --catalog SOURCE)

A middle shape between the offline direct flash and the PXE-driven flows. The operator boots from the same USB live stick but points the wizard at a network-shared bty-web for the catalog. Useful for a small team that wants one place for pre-built images without the full PXE deploy.

  1. Someone (operator’s workstation, a homelab server, a dev box) runs bty-web. The lowest-friction shape is the published container:

    docker run -d -p 8080:8080 -v bty-data:/var/lib/bty \
      ghcr.io/safl/bty-web:latest
    

    Pre-built images dropped into the volume show up in the /images endpoint. Any bty-web instance serves /images and works identically as a catalog source, whether run from a bare docker run or the full uvx bty-lab init stack.

  2. Operator boots a target from the bty USB live stick. bty auto-launches in local-only mode (no bty.mac= on the kernel cmdline). On the first stage (SELECT_CATALOG) the operator picks [c] custom and types http://<host>:8080/catalog.toml. Alternatively, relaunch the wizard with bty --catalog http://<host>:8080/catalog.toml from a shell on Alt+F2.

  3. The wizard fetches GET /catalog.toml, merges it with the local image-root, and advances to the image picker. Operator picks an image + a target disk, confirms. Image bytes stream from GET /images/{name} through curl | dd to the target disk - no temp file.

  4. On completion the operator removes the stick and reboots; the target boots into the freshly-flashed image. The server has no per-MAC record (this isn’t PXE), so no follow-up state to manage. The operator’s pick is never reported back: the catalog source is a one-way feed in this mode.

No PXE, no DHCP-proxy, no L2 broadcasts. The container can live anywhere reachable - operator’s laptop, an EC2 instance, anywhere with HTTP. The cost: the operator still has to plug in the USB stick and stand at the target.
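The curl | dd streaming in step 3 can be pictured as a fixed-size chunked copy between two file-like objects. The following is a minimal Python sketch of that idea, not bty's implementation; stream_copy and the chunk sizes are hypothetical names and values chosen for illustration, and in-memory buffers stand in for the HTTP body and the target disk.

```python
import io


def stream_copy(src, dst, chunk_size=4 * 1024 * 1024):
    """Copy src to dst one chunk at a time and return the bytes written.

    At most one chunk is held in memory, so no temp file is needed.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for the HTTP response body and the block device.
source = io.BytesIO(b"x" * 10_000)
target = io.BytesIO()
written = stream_copy(source, target, chunk_size=4096)
```

Because only one chunk is in flight at a time, memory use stays flat regardless of image size, which is the property the "no temp file" note relies on.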

Sub-case: virtual USB via IP-KVM (PiKVM, JetKVM)

The “USB live stick” in step 2 need not be physical. IP-KVM appliances (PiKVM, JetKVM, BMC IPMI virtual media, vendor-specific OoB consoles) can mount the bty .iso artifact and expose it to the target as a USB or CD-ROM device. The target boots into the bty live env exactly as from a physical stick; bty auto-launches on tty1; the operator types c, fills in http://<host>:8080/catalog.toml, picks an image, picks a target disk, flashes. The whole sequence runs through the IP-KVM session, no one at the rack.

Practical notes:

  • Use the .iso artifact directly (uncompressed since v0.25.4).

  • bty’s hybrid ISO works as either USB or CD-ROM; pick whichever your IP-KVM offers and the target’s BIOS/UEFI prefers.

  • Keystroke latency over IP-KVM is real; the wizard’s Enter-forward / Esc-back UX keeps per-step input minimal.

  • The bty live env’s tty1 framebuffer renders cleanly through every IP-KVM tested (PSF console fonts, no nerd-font / emoji dependencies). The plain-ASCII /etc/issue banner and the wizard’s Rich panels render identically over IP-KVM and locally.

This is what “bare-metal provisioning over the internet” looks like for a small fleet without PXE: a PiKVM at each site, a bty-web container somewhere with the catalog, an operator at home with a browser tab.

Sub-case: Ventoy multi-ISO stick

Ventoy replaces the bootloader on a USB stick with a menu that boots any .iso dropped onto its data partition. bty-usb.iso works there: boot the stick, pick bty-usbboot-pc-x86_64.iso from the Ventoy menu, and the target boots into the bty live env exactly as if dd’d directly.

Two ways to use Ventoy with bty:

  1. bty-usb plus a remote catalog. Same shape as the IP-KVM sub-case above: bty auto-launches, the operator presses c and types the bty-web URL. Ventoy is just a different boot mechanism; the catalog source is unchanged.

  2. bty-usb plus images on the same Ventoy partition. Drop .img.zst / .qcow2 / .img.gz files onto the Ventoy data partition next to the bty ISO. After bty boots, the partition is still attached to the host (it’s the physical USB stick the live env booted from). Mount it and point bty at the path via BTY_IMAGE_ROOT:

    # On the booted bty live env's tty1 (drop to a shell first):
    mount /dev/sdaN /mnt          # Ventoy data partition
    BTY_IMAGE_ROOT=/mnt bty
    

    No bty-web server needed for this variant - same self-contained shape as a stock bty-usb stick, just with Ventoy’s multi-ISO bootloader replacing the bty bootloader.

The BTY_IMAGES auto-mount relies on the partition label; Ventoy’s data partition is labeled Ventoy by default, so the auto-mount does not trigger. Either relabel that partition BTY_IMAGES (for auto-mount) or mount it manually as in option 2.

Interactive PXE flash (boot_mode=bty-tui)

The “bty-on-a-USB but over the network” path. A newly discovered MAC starts as boot_mode=bty-inventory, so the only per-MAC configuration this flow needs is switching the machine to boot_mode=bty-tui.

  1. Operator brings up the bty-web container deploy: run uvx bty-lab init /opt/bty (create + chown the dir first), set HOST_ADDR + passwords in /opt/bty/envvars, then run podman compose -f /opt/bty/compose.yml --env-file /opt/bty/envvars --profile tftp up -d. The web UI is gated by $BTY_ADMIN_PASSWORD (unset = open, with a startup warning; rotate by changing the env var and restarting bty-web). Point your LAN DHCP server (option 60/66/67) at the host using the Netboot page cheatsheet; bty serves TFTP via the sidecar but does not run DHCP.

  2. A target PXE-boots on the same segment for the first time. bty-web auto-discovers the MAC as boot_mode=bty-inventory (self-reports its disks, then boots the disk). To drive it with the interactive wizard instead, set boot_mode=bty-tui on the machine; bty-web then serves the iPXE-tui template (ipxe_tui.j2).

  3. The target chains into the bty live env with bty.server=URL + bty.mac=MAC on the kernel cmdline (the iPXE template carries nothing else; every other knob comes from the plan endpoint). bty-on-tty1.service takes over tty1 in place of the agetty and exec’s bty --server URL --mac MAC.

  4. bty auto-posts the local disk inventory to POST /pxe/{mac}/inventory on startup (no operator action). bty-web stores it on the machine row; the /ui/machines/{mac} page now shows a real path / model / serial dropdown for picking a target disk. Then bty GETs <server>/pxe/<mac>/plan and sees mode=interactive for boot_mode=bty-tui.

  5. bty drops into the wizard with the server’s catalog pre-loaded (GET /catalog.toml). The operator picks an image and a target disk, confirms the flash. Image bytes stream from GET /images/{name} through curl | dd to the target disk - no temp file, no intermediate download.

  6. On success bty POSTs /pxe/{mac}/done so last_flashed_at updates server-side. The image pick itself is NOT reported back: the machine’s bty_image_ref stays whatever it was (or null). For server-tracked flashes, set boot_mode=bty-flash-always with a bound ref + serial. The next reboot chains the wizard again unless the operator flips boot_mode.

This flow also suits the operator who wants a one-off remote flash without preparing a USB stick: any machine set to boot_mode=bty-tui becomes a bty wizard session reachable via IPMI / serial console.

Server-driven PXE flash (boot_mode=bty-flash-always)

Fleet-managed provisioning, where targets are reflashed on schedule, on demand, or on failure.

  1. The bty-web deploy is already up (same setup as the interactive flow above).

  2. The target’s first PXE contact creates a Machine record with boot_mode=bty-inventory (the auto-discovery default). The live env runs bty on tty1, which automatically posts the box’s disk inventory to POST /pxe/{mac}/inventory on startup (no operator action).

  3. Operator assigns MAC -> image + target_disk + boot_mode in the web UI:

    • bty_image_ref (image binding) - picked from the catalog.

    • target_disk_serial (which disk to flash) - picked from the inventory dropdown populated in step 2.

    • boot_mode=bty-flash-always arms the auto-flash.

  4. Target machine PXE-boots; bty-web’s /pxe/{mac} returns the iPXE flash chain. Cmdline carries just bty.server + bty.mac; iPXE chains into the bty live env served over HTTP by bty-web.

  5. bty-on-tty1.service exec’s bty --server URL --mac MAC. bty GETs /pxe/<mac>/plan, sees mode=flash with the image URL + target_disk_serial filled in, resolves the serial to a /dev/... path via lsblk, fetches the assigned image from whatever URL the plan carries (withcache when configured + warm; the raw upstream oras:// or https:// otherwise), runs the flash, POSTs /pxe/{mac}/done to update last_flashed_at, then reboots automatically.

  6. The reboot lands back on PXE (PXE-first firmware). Because the box fetched the live-env artifacts during steps 4-5, boot_mode=bty-flash-always now serves a one-shot boot of the just-flashed disk (UEFI exit / BIOS sanboot) instead of the flash chain, so the freshly imaged OS boots. The next power cycle (no artifact fetch in between) serves the flash chain again - so a per-job CI cadence reflashes every cycle while still booting the image each time, no mode change. bty-flash-once works the same way but doesn’t re-arm, so after the one flash it keeps booting the disk - and it stays boot_mode=bty-flash-once throughout (see below).

  7. First-boot bring-up (users, network, packages, hostnames) is the pre-built image’s job, baked in via cloud-init / NoCloud user-data at image-build time. bty has no online provisioning step.

Both BIOS and UEFI clients are supported via iPXE.
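Both PXE flows hinge on the mode field that bty reads from GET /pxe/<mac>/plan. The sketch below shows one way a client could dispatch on it. Only the mode values (interactive, flash, inventory) come from this page; the function name, the returned action strings, and the image_url key are hypothetical, and the real plan schema may differ.

```python
def next_action(plan):
    """Map a plan document (a dict) to the action the live env takes."""
    mode = plan.get("mode")
    if mode == "interactive":
        return "wizard"  # operator drives the flash by hand
    if mode == "inventory":
        return "post-inventory-and-reboot"
    if mode == "flash":
        # Hypothetical key names: an auto-flash needs both an image
        # location and a target serial before it may proceed.
        missing = [k for k in ("image_url", "target_disk_serial") if not plan.get(k)]
        if missing:
            raise ValueError(f"flash plan is missing: {', '.join(missing)}")
        return "flash"
    raise ValueError(f"unknown plan mode: {mode!r}")


action = next_action(
    {"mode": "flash", "image_url": "https://example.invalid/a.img.zst",
     "target_disk_serial": "SER123"}
)
```

Raising on an incomplete flash plan, rather than falling back to the wizard, keeps an unattended box from waiting on input nobody will provide.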

Machine state model

Every machine record on bty-web carries five operator-controlled fields, plus the reported disk inventory (known_disks) and three timestamps that the server maintains:

| Field | Meaning |
| --- | --- |
| bty_image_ref | sha256 of canonicalised catalog src. Stable provenance ID; binds the image to flash. |
| labels | Free-form display tags (a set per machine; max 64 chars each, 16 per machine). Cosmetic; not consumed by the flash chain. |
| boot_mode | One of ipxe-exit / bty-flash-always / bty-flash-once / bty-tui / bty-inventory (PUT default ipxe-exit; auto-discovery default bty-inventory). |
| sanboot_drive | iPXE BIOS drive for the legacy-BIOS disk boot (e.g. 0x80; null = default first disk). BIOS only; on UEFI the box exits to firmware. |
| target_disk_serial | Operator-picked serial number from the most recent inventory post. |
| known_disks | JSON array of disks the live env’s bty reported on startup. |
| last_seen_at | Updated on every GET /pxe/{mac} hit. |
| last_flashed_at | Updated on every POST /pxe/{mac}/done. |
| known_disks_at | Updated on every POST /pxe/{mac}/inventory. |

The boot_mode is the primary control knob; the rest provide the parameters the policy needs.
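As a reading aid, the record can be written down as a small Python dataclass. This is an illustrative sketch, not bty-web's actual model: the field names, the boot_mode values, the PUT default, and the label limits come from the table above, while the mac key, the Python types, and the timestamp representation are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

BOOT_MODES = {"ipxe-exit", "bty-flash-always", "bty-flash-once", "bty-tui", "bty-inventory"}
MAX_LABELS = 16      # labels per machine
MAX_LABEL_LEN = 64   # characters per label


@dataclass
class Machine:
    mac: str                                   # assumed record key
    # Operator-controlled fields
    bty_image_ref: Optional[str] = None
    labels: set = field(default_factory=set)
    boot_mode: str = "ipxe-exit"               # the PUT default
    sanboot_drive: Optional[str] = None        # e.g. "0x80"; BIOS only
    target_disk_serial: Optional[str] = None
    # Server-maintained fields (timestamp type is an assumption)
    known_disks: list = field(default_factory=list)
    last_seen_at: Optional[str] = None
    last_flashed_at: Optional[str] = None
    known_disks_at: Optional[str] = None

    def __post_init__(self):
        if self.boot_mode not in BOOT_MODES:
            raise ValueError(f"unknown boot_mode: {self.boot_mode!r}")
        if len(self.labels) > MAX_LABELS:
            raise ValueError("too many labels")
        if any(len(label) > MAX_LABEL_LEN for label in self.labels):
            raise ValueError("label too long")


box = Machine(mac="aa:bb:cc:dd:ee:ff", labels={"rack-3", "ci"})
```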

boot_mode values

| Mode | What GET /pxe/{mac} returns | Mutates boot_mode? |
| --- | --- | --- |
| ipxe-exit | ipxe_sanboot.j2 - iPXE boots the local disk: UEFI exits to the firmware boot order, legacy BIOS sanboot --drive <sanboot_drive> \|\| exit. The PUT default. | No. |
| bty-flash-always | ipxe_flash.j2 for a fresh flash, then a one-shot disk boot (ipxe_sanboot.j2) on the contact after the live-env artifact fetch (alternates flash then boot). Refuses (falls back to ipxe.j2 exit) if no target_disk_serial. | No. The transient saw_flasher_boot bit drives the alternation. |
| bty-flash-once | Same chain + target_disk_serial gate as bty-flash-always, but the bit doesn’t re-arm: after the one flash, every later contact serves the disk boot. | No. Stays bty-flash-once. |
| bty-tui | ipxe_tui.j2 (live env chain; bty on tty1 GETs /pxe/<mac>/plan -> mode=interactive, drops into wizard). bty auto-posts inventory on startup. | No. |
| bty-inventory | Alternates the live-env chain (plan mode=inventory: bty posts disks + reboots) then the disk boot (ipxe_sanboot.j2), via the same saw_flasher_boot bit as bty-flash-always. Re-collects inventory every cycle, so swapped hardware is found. The auto-discovery default. | No. Alternates via the bit. |

boot_mode is the operator’s intent and is never rewritten by the server. bty-flash-once is the “reimage this box now, then leave it alone” pattern: it flashes on the next boot, then the saw_flasher_boot bit (armed when the box booted the flasher) makes every later contact boot the disk instead - while the record stays bty-flash-once. It differs from bty-flash-always only in that the bit never re-arms, so it won’t reflash again until the operator re-saves the machine. bty-flash-always re-arms each cycle: the per-job CI cadence.
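The alternation can be modelled as a pure function from (boot_mode, saw_flasher_boot, target_disk_serial) to the template served. The sketch below is one reading of the behaviour described above, not bty-web's code: the template names come from this page, decide_offer is a hypothetical name, and the assumption that the bit is cleared when the disk boot is served (for every mode except bty-flash-once) is an inference from "alternates" and "doesn't re-arm".

```python
FLASH_MODES = {"bty-flash-always", "bty-flash-once"}


def decide_offer(boot_mode, saw_flasher_boot, target_disk_serial):
    """Return (template, next_saw_flasher_boot) for one GET /pxe/{mac}."""
    if boot_mode == "ipxe-exit":
        return "ipxe_sanboot.j2", saw_flasher_boot
    if boot_mode == "bty-tui":
        return "ipxe_tui.j2", saw_flasher_boot
    if boot_mode not in FLASH_MODES and boot_mode != "bty-inventory":
        raise ValueError(f"unknown boot_mode: {boot_mode!r}")
    if boot_mode in FLASH_MODES and not target_disk_serial:
        return "ipxe.j2", saw_flasher_boot  # safety gate: refuse the flash chain
    if saw_flasher_boot:
        # The live env already ran this cycle: boot the disk. Only
        # bty-flash-once keeps the bit set, so it never flashes again.
        return "ipxe_sanboot.j2", boot_mode == "bty-flash-once"
    live_chain = "ipxe_flash.j2" if boot_mode in FLASH_MODES else "ipxe_tui.j2"
    return live_chain, False


# Four consecutive PXE contacts of a bty-flash-always machine.
bit = False
offers = []
for _ in range(4):
    template, bit = decide_offer("bty-flash-always", bit, "SER123")
    offers.append(template)
    if template == "ipxe_flash.j2":
        bit = True  # the live-env artifact fetch arms the bit
```

Note that boot_mode is an input only: nothing in the function returns a new mode, which mirrors the rule that the server never rewrites the operator's intent.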

Inventory + safety-gate flow

The target_disk_serial gate prevents “wrong disk wiped” incidents on multi-disk hosts. The full picture, in event order:

  1. First contact, no inventory yet. Operator powers on a new box. The firmware PXE-DHCPs, gets ipxe.efi via TFTP, runs the embedded chain script, fetches /pxe-bootstrap.ipxe from bty-web, chains to /pxe/{mac}. bty-web records the MAC (machine.discovered event), sets boot_mode=bty-inventory, returns the live-env chain (ipxe_tui.j2). Audit log gets a netboot.pxe.offered row with offer_kind=bty-inventory.

  2. Live env boots, bty starts. bty runs on tty1; on startup it shells out to lsblk and POSTs the result to /pxe/{mac}/inventory. bty-web stores the inventory as JSON on the machine row, updates known_disks_at, records machine.inventory. Fire-and-forget: failures land in the tty1 status bar but don’t block the operator.

  3. Operator opens /ui/machines/{mac}. The Target disk dropdown is now populated from known_disks, showing path / size / model / serial per disk. The operator picks one + binds an image + sets boot_mode=bty-flash-always.

  4. Operator power-cycles the target. Next PXE contact: /pxe/{mac} sees boot_mode=bty-flash-always, bty_image_ref bound, and target_disk_serial picked. Returns ipxe_flash.j2 with bty.server= + bty.mac= on the cmdline (the image URL + target serial come from the plan endpoint, not the cmdline).

  5. Live env flashes. bty on tty1 GETs /pxe/<mac>/plan, sees mode=flash with image + target_disk_serial filled in, shells out to lsblk -o SERIAL, matches the serial to a path, runs the flash on that path, POSTs /pxe/{mac}/done (audit: machine.flashed), reboots.

The gate fires at multiple points:

  • /ui/machines/{mac} POST refuses boot_mode=bty-flash-always when target_disk_serial is empty. The form bounces to /ui/machines/{mac}?error=... so the operator sees a banner explaining how to fix it.

  • /pxe/{mac} refuses the flash chain when target_disk_serial is empty. Returns ipxe.j2 (local fallback) and records a netboot.pxe.flash.no_target_disk event so the operator can see on /ui/events why their box isn’t reflashing.

  • bty in auto-flash mode refuses when the plan’s serial doesn’t match any current disk. Prints a red Panel listing the current disks and serials, exits non-zero. The bty-on-tty1 service stays at the failed banner; the operator can re-pick on the server and retry.

The serial-match (vs path-match) at flash time is the durable guarantee: device paths are assigned in probe order, so /dev/sda can become /dev/sdb across boots or kernel versions, but the disk’s serial number is fixed.
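A minimal sketch of the serial-to-path lookup, assuming the inventory is taken from lsblk's JSON output (for example lsblk --json -d -o PATH,SERIAL). The exact columns bty requests are not stated here beyond SERIAL, and resolve_serial and the sample data are hypothetical.

```python
import json


def resolve_serial(lsblk_json, serial):
    """Return the device path whose serial matches, or None to refuse."""
    if not serial:
        return None  # never match disks that simply report no serial
    devices = json.loads(lsblk_json).get("blockdevices", [])
    matches = [d["path"] for d in devices if d.get("serial") == serial]
    # Refuse on zero matches and on ambiguity alike.
    return matches[0] if len(matches) == 1 else None


# Illustrative inventory; real output carries whatever columns were requested.
SAMPLE = """
{"blockdevices": [
  {"path": "/dev/sda", "serial": "WD-123"},
  {"path": "/dev/nvme0n1", "serial": "S4EV-456"},
  {"path": "/dev/sdb", "serial": null}
]}
"""

target = resolve_serial(SAMPLE, "S4EV-456")
```

Returning None for an empty serial matters: without that guard, a plan with a blank serial could match a disk that reports none.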

Automated event-driven transitions

bty-web triggers a few automated mutations in response to HTTP requests from the live env. None require operator action.

POST /pxe/{mac}/done (live env signals completion)

Always:

  • Updates last_flashed_at + updated_at.

  • Records machine.flashed event with the requesting IP.

boot_mode is not touched - for any mode, including bty-flash-once. The mode is the operator’s intent and stays put; the saw_flasher_boot bit (armed when the box booted the flasher) is what makes the next PXE contact boot the disk instead of reflashing, so a finished bty-flash-once still reads bty-flash-once.

POST /pxe/{mac}/inventory (bty reports disks)

  • Replaces the entire known_disks JSON column with the new payload (no merge - the live env is authoritative for the box’s current disks).

  • Updates known_disks_at.

  • Records machine.inventory with the disk count + list of serials.

  • 404s if the MAC has no machine record (prevents a renegade bty from creating ghost machines).
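The replace-not-merge and 404 rules can be sketched against an in-memory store. This is illustrative only: post_inventory, the dict-based store, and the 200 success status are assumptions; only the replace semantics, the known_disks_at update, and the 404 come from the list above.

```python
def post_inventory(machines, mac, disks, now):
    """Apply an inventory post to `machines` and return an HTTP-style status."""
    record = machines.get(mac)
    if record is None:
        return 404  # unknown MAC: never create a ghost machine
    record["known_disks"] = list(disks)  # full replacement, no merge
    record["known_disks_at"] = now
    return 200  # assumed success status


machines = {
    "aa:bb:cc:dd:ee:ff": {"known_disks": [{"serial": "OLD-1"}], "known_disks_at": None}
}
status = post_inventory(
    machines, "aa:bb:cc:dd:ee:ff", [{"serial": "NEW-1"}], "2024-01-01T00:00:00Z"
)
ghost = post_inventory(machines, "11:22:33:44:55:66", [{"serial": "X"}], "2024-01-01T00:00:00Z")
```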

GET /pxe/{mac} (firmware fetches the per-MAC chain)

Always:

  • Inserts or updates the machine row (machine.discovered fires on first contact; subsequent hits just touch last_seen_at + last_seen_ip).

  • Records netboot.pxe.offered with the offer kind so an operator can ask “what did the server hand back to MAC X at time T?” without debug logging.

Conditional:

  • netboot.pxe.flash.no_target_disk fires when boot_mode=bty-flash-always / bty-flash-once is set, an image is bound, the ref resolves, but target_disk_serial is empty. Distinct kind so the operator can filter for “why isn’t this reflashing?” cases.

  • netboot.pxe.flash.orphan_ref fires when boot_mode=bty-flash-always is set and an image is bound but the ref has no resolvable catalog_entries row. Different failure mode from no_target_disk; the binding itself is stale.

Audit log: event kinds by trigger

| Kind | Fires when… |
| --- | --- |
| machine.discovered | A MAC not in machines hits GET /pxe/{mac}. |
| machine.created | Operator PUT /machines/{mac} for a MAC not yet recorded. |
| machine.upserted | Operator PUT /machines/{mac} for an existing record. |
| machine.deleted | Operator DELETE /machines/{mac}. |
| machine.flashed | Live env POST /pxe/{mac}/done. |
| machine.inventory | Live env POST /pxe/{mac}/inventory. |
| netboot.pxe.offered | Every GET /pxe/{mac} hit. details.offer records what was returned (flash / sanboot / tui / inventory / ipxe-exit) and details.reason annotates refusals (no_target_disk / orphan_ref). |
| netboot.pxe.plan | GET /pxe/{mac}/plan resolved a flash plan (image / target disk / boot args) for an auto-flash request. |
| netboot.flasher.armed | First /boot/<artifact>?mac= fetch in a cycle armed saw_flasher_boot=1 (the live env booted). Idempotent; only the 0->1 transition lands a row. |
| catalog.entry.added | Operator POST /catalog/entries (form or JSON) succeeds. |
| catalog.entry.add.failed | sha resolve / oras resolve / duplicate-src on POST /catalog/entries. |
| catalog.entry.deleted | Operator DELETE /catalog/entries. |
| catalog.entries.imported | POST /catalog/import or the form-style /ui/catalog/upload / /ui/catalog/fetch-release ingested a catalog.toml. |
| netboot.artifacts.fetch.requested / .started / .cancelled / .failed / netboot.artifacts.fetched | Lifecycle events for the release-artifact fetch worker (POST /workers/releases or POST /boot/releases); terminal success kind is bare netboot.artifacts.fetched. |
| settings.upstream.updated | Operator POST /ui/settings/upstream saved the release-repo / catalog-URL / release-tag overrides. |
| settings.backup.updated | Operator POST /ui/settings/backup saved the scheduled-backup knobs (enabled / cadence / retention). |
| settings.config.updated / .failed | Operator POST /ui/settings/config/edit round-tripped a bty.toml field through tomlkit. |
| backup.create.requested / .started / .cancelled / backup.created / backup.failed / backup.deleted / backup.pruned | BackupManager lifecycle; terminal success is bare backup.created. backup.pruned fires when retention deletes an old bundle after a successful run. |
| auth.login.succeeded | Operator POST /ui/login with a password matching $BTY_ADMIN_PASSWORD. |
| auth.login.failed | Same path with a mismatched password. |
| auth.logout | Operator POST /ui/logout from an authed session. |
| system.schema.reset | init_db rotated state.db on a version mismatch (auto-rotate on schema mismatch – see operations.md). |

Every row carries subject_kind (machine / image / catalog / netboot / settings / auth / backup), a subject_id, the requesting source_ip, the actor (operator / pxe-client / system), and a JSON details blob with kind-specific extras.

Operator UI actions: a quick map

| Action | UI path | What happens server-side |
| --- | --- | --- |
| Log in | POST /ui/login | Constant-time compare against $BTY_ADMIN_PASSWORD -> session cookie. Records auth.login.{succeeded,failed}. |
| Log out | POST /ui/logout | Clears session cookie. Records auth.logout. |
| Bind image + disk + policy on a machine | POST /ui/machines/{mac} | UPSERT. Refuses boot_mode=bty-flash-always without target_disk_serial. Records machine.{created,upserted}. |
| Delete a machine record | POST /ui/machines/{mac}/delete | DELETE row. Records machine.deleted. |
| Add catalog entry by URL | POST /ui/catalog/entries | sha-resolve (if sha_url given) -> INSERT catalog_entries. Records catalog.entry.{added,add.failed}. |
| Delete a catalog entry | DELETE /catalog/entries?src=... | Removes the row. v0.40+: no on-disk cached bytes to clean up; withcache evicts on its own schedule. Records catalog.entry.deleted. |
| Upload a catalog.toml manifest | POST /ui/catalog/upload | Validates + atomic-renames into ${BTY_PATHS_STATE_DIR}/catalog.toml. |
| Fetch catalog.toml from the configured URL | POST /ui/catalog/fetch-release | GETs the operator’s [upstream] catalog_url (defaults to a pinned nosi release), same persist as upload. |
| Fetch boot artifacts (kernel + initrd + squashfs) | POST /ui/netboot/fetch-release | Pulls release artifacts into BTY_PATHS_BOOT_DIR. Lifecycle events netboot.artifacts.fetch.{requested,started,cancelled,failed} and netboot.artifacts.fetched. |
| Save upstream sources (netboot repo / tag, catalog URL) | POST /ui/settings/upstream | Persists overrides into state.db (settings table); fetch routes resolve from this at request time. Records settings.upstream.updated. |
| Save scheduled-backup knobs (enabled / cadence / retention) | POST /ui/settings/backup | Same persistence; scheduler picks up changes on the next 60s tick. Records settings.backup.updated. |
| Edit a bty.toml config field | POST /ui/settings/config/edit | Per-row inline form on /ui/settings; rounds the value through tomlkit to preserve operator formatting + reloads the active config. Records settings.config.{updated,failed}. |
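The “constant-time compare” in the Log in row refers to a comparison whose duration does not depend on how many leading characters match, which keeps a timing attack from recovering the password one character at a time. In Python the standard-library primitive is hmac.compare_digest; the wrapper below is a hypothetical sketch, not bty-web's login handler.

```python
import hmac


def password_matches(submitted: str, expected: str) -> bool:
    """Compare two passwords without leaking the position of the first mismatch."""
    # Encode to bytes first: compare_digest only accepts str when it is ASCII.
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))


ok = password_matches("correct horse", "correct horse")
bad = password_matches("correct hors", "correct horse")
```

A plain == on strings can return as soon as it finds a differing character, which is exactly the timing signal compare_digest is designed to avoid.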

Safety gates summary

Where bty-web refuses what the operator asked, and what the operator sees:

| Gate | Trigger condition | Where it fires | Operator surface |
| --- | --- | --- | --- |
| Refuse flash chain without target_disk_serial | boot_mode=bty-flash-always/bty-flash-once, image bound, target empty. | GET /pxe/{mac} | netboot.pxe.flash.no_target_disk event; ipxe.j2 (exit to firmware). |
| Refuse boot_mode=bty-flash-always upsert without target | Form posts boot_mode=bty-flash-always and target_disk_serial="". | POST /ui/machines/{mac} | 303 to /ui/machines/{mac}?error=... flash banner. |
| Refuse flash on serial mismatch at boot time | Live env can’t find a current disk whose serial matches the plan’s target_disk_serial. | bty auto-flash on tty1 (live env) | bty prints a red “No matching disk” Panel + non-zero exit; bty-on-tty1.service stays at the failed banner. |
| Refuse oversize catalog upload | /ui/catalog/upload body > 1 MiB. | POST /ui/catalog/upload | 303 with ?error=...exceeded.... |
| Refuse oversize boot artifact upload | PUT /boot/{name} body > BTY_TUNING_MAX_UPLOAD_BYTES (200 GiB). | PUT /boot/{name} | 413 Content Too Large. |
| Refuse non-TOML catalog upload | Filename extension not .toml/.tml OR TOML parse fails. | POST /ui/catalog/upload | 303 with ?error=... flash. On-disk manifest preserved on parse failure. |
| Refuse non-2xx catalog fetch-release body | HTTPError 404, URLError, TimeoutError, or non-TOML body. | POST /ui/catalog/fetch-release | 303 with ?error=.... |
| Refuse mismatched login | Submitted password does not match $BTY_ADMIN_PASSWORD. | POST /ui/login | Login form re-rendered with Invalid password. |
| Refuse unknown boot_mode | Pydantic pattern check on BOOT_POLICIES. | PUT /machines/{mac} + form sibling | 422 (JSON) / 303 with flash (form). |
| Refuse path-traversal in upload {name} | ..%2F or .. segments in PUT /boot/{name}. | _safe_path boundary check | 400 / 404 / 405 depending on the request shape. |
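A boundary check of the kind the _safe_path gate performs can be sketched with pathlib: resolve the requested name against the upload root and refuse anything that lands outside it. The function below is a hypothetical stand-in written for illustration; bty-web's real _safe_path may differ in signature and behaviour.

```python
import tempfile
from pathlib import Path


def safe_path(root, name):
    """Resolve `name` under `root`; return None if the result escapes `root`."""
    root = Path(root).resolve()
    candidate = (root / name).resolve()  # collapses ".." segments and symlinks
    if root not in candidate.parents:
        return None  # escaped the root, or named the root itself
    return candidate


with tempfile.TemporaryDirectory() as boot_dir:
    ok = safe_path(boot_dir, "vmlinuz")
    escaped = safe_path(boot_dir, "../secret")
    absolute = safe_path(boot_dir, "/etc/passwd")
```

Checking the resolved path, rather than scanning the raw name for "..", also covers absolute names and symlinks that point outside the root.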