30 Commits

Author SHA1 Message Date
librelad
fa47e16cab feat(updater): automatic background scan for versions, CVEs & improvements
Replace the click-to-scan-only flow with a self-throttled auto-scan that
rides the existing task-processor idle poll (the same shape as the
network-drift check — no new daemon, unit, or endpoint):

- 'libreportal updater check auto' gates on the age of the generated
  updates.json vs CFG_UPDATER_SCAN_INTERVAL (minutes, default 30,
  0 disables); a fresh file makes the 60s tick a single stat() + return.
  Manual checks and post-update rescans reset the clock for free, and a
  missing file means the first scan runs ~a minute after install.
- Eligible signed hotfixes keep flowing through artifactApplyAuto, which
  only enqueues ordinary tasks — mutations stay on the task path.
- Open updater surfaces (standalone /updater and the fleet Overview's
  headless UpdaterPage) follow along with a 60s static-JSON re-read that
  repaints only when a generated_at stamp changed; timer released via
  dispose() on unmount, ticks skipped while hidden.
- Empty states now say the first scan happens automatically; Check now
  stays as the immediate manual override.
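
The age gate in the first bullet can be sketched as follows. This is an illustration only, not the shipped code: `scan_is_due` and its argument order are hypothetical, and GNU `stat -c %Y` is assumed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide whether an auto-scan is due.
# Args: <path to updates.json> <interval in minutes; 0 disables>
# Returns 0 when a scan should run, 1 otherwise.
scan_is_due() {
    local file="$1" interval_min="${2:-30}" now mtime
    (( interval_min == 0 )) && return 1       # 0 disables the auto-scan
    [[ -f "$file" ]] || return 0              # missing file: first scan is due
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 0   # GNU stat; unreadable: rescan
    (( now - mtime >= interval_min * 60 ))    # due once the file is old enough
}
```

With a fresh file the call costs one stat() and returns non-zero, which is what keeps the 60s tick cheap.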

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-06-12 22:07:42 +01:00
librelad
376610cd11 feat(apps): scoped multi-instance support (run two of an app)
Lets a *multi-instance-capable* app run as several fully isolated instances
on one box (e.g. two Bookstack/WordPress sites, or a "family" + "work"
Nextcloud) — distinct data, DB, subdomain, backups and update cadence.

Design: an instance is just another app. It gets its own slug (<type>_<id>),
its own CFG_<SLUG>_* namespace, deployed dir, DB row, IP/port allocation and
host, so the entire existing pipeline (scan, install, services, routing,
updater, backups) treats it like any app with zero changes. All
instance-specific rewriting is confined to a clone of the type's template;
the shipped template and the core engine are untouched.

Gating: opt-in per app via CFG_<TYPE>_MULTI_INSTANCE=true. Only Bookstack
carries it for now (the validated reference). The other 31 apps are
unaffected — the feature is invisible unless the flag is present.

- scripts/instance/instance_create.sh — clone + re-namespace config, rewrite
  compose identity (container_name / Traefik routers / backup labels) and
  per-app tools, set a hostname-safe subdomain (PORT field 10), then hand off
  to dockerInstallApp. Plus instanceList / instanceRemove.
- libreportal instance create|remove|list — new CLI category; mutations route
  through the task system (no new mutating API endpoint).
- WebUI: "instance of <type>" badge + a "New instance" card action on capable
  apps, and a create modal (name + domain# + subdomain, live host preview)
  that dispatches the standard task. Capability/instance-of read straight off
  the already-exposed app config.

Known follow-ups (documented): flip the flag on more apps after a compose
identity check (Nextcloud next); per-app tools are best-effort isolated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-06-04 23:34:52 +01:00
librelad
20f8ca2eb5 feat(network): detect + heal apps stranded off the docker subnet
Closes the gap behind the vpn-recreate bug: when the shared network is
recreated with a different /24, every app's stored static IP is left
outside it and adoptDockerSubnet only realigns CFG, not the apps.

- networkScanConflicts (network_conflicts.sh): read-only scan diffing each
  active network_resources IP against docker's real subnet (via ipInSubnet).
  Per-service routing-aware — skips gateway-routed services whose ipv4 is
  commented out in the deployed compose, so gluetun apps don't false-positive.
  Distinguishes 'daemon down' (benign) from 'network missing' (real).

- webuiSystemNetworkCheck (webui_system_network.sh): self-throttled generator
  that writes frontend/data/system/network_status.json (modelled on
  verify_status.json). Wired into webuiSystemUpdate AND run unconditionally
  every ~60s from the task-processor poll (regen webui is mtime-gated and
  would never fire on drift, which touches no source file).

- networkHealConflicts (network_heal.sh) + 'libreportal system network
  check|heal [app]': the heal adopts docker's subnet in-process, then re-IPs
  stranded apps with reset_network=ip (ports preserved), gluetun first.
  Mutating path runs only through the task system (dual-mode, like update
  apply); read-only check runs inline.
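
A subnet membership test of the kind the scan relies on could look like this. The commit names ipInSubnet but does not show it, so `ip_to_int` and `ip_in_subnet` below are hypothetical stand-ins, IPv4 only, with no input validation.

```bash
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 when <ip> lies inside <cidr> (e.g. 172.20.0.0/24).
ip_in_subnet() {
    local ip="$1" net="${2%/*}" bits="${2#*/}"
    local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
}
```

An app whose stored IP fails this test against docker's real subnet is what the scan would report as stranded.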

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 16:03:53 +01:00
librelad
96b04392dc feat(distribution): Phase 3 — hotfix scan generator + severity-split auto-apply
- CFG_HOTFIX_AUTO (security-breakage|all|off, default security-breakage) seeded in
  general_terminal; reaches existing installs via the add-only config reconciler.
- webui_artifact_scan.sh (webuiArtifactScan): fetch+verify the signed index, write
  artifacts_available.json ATOMICALLY (build in temp → jq-validate → one write;
  keep the prior file on any failure — never emits broken JSON). Annotates each
  artifact with applied (a per-id record exists) + applicable (target installed).
- artifactApplyAuto + `libreportal artifact apply-auto`: enqueue apply tasks for
  the eligible signed hotfixes — only when the index is VERIFIED-signed, only
  auto==true + in the severity policy + applicable + not already applied. Each
  apply is its own task (visible in the log + History), never applied inline.
- `updater check` now also refreshes the index (webuiArtifactScan) and runs
  artifactApplyAuto — one front door, no second phone-home.

Unit-tested 13/13: policy filtering (security-breakage / off / all), auto:false
exclusion, already-applied skip, non-installed-app skip, unsigned-index fail-closed,
and the scan transform's signed/applied/applicable fields.
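
The eligibility rules the tests cover can be written as one predicate. A sketch under assumptions: `hotfix_is_eligible` is a hypothetical name, the flags are passed as the strings true/false, and the severity labels `security` and `breakage` are inferred from the policy name rather than taken from the source.

```bash
#!/usr/bin/env bash
# Hypothetical predicate mirroring the auto-apply rules.
# Args: <policy> <index_signed> <auto> <severity> <applicable> <applied>
hotfix_is_eligible() {
    local policy="$1" signed="$2" auto="$3" severity="$4" applicable="$5" applied="$6"
    [[ "$signed" == "true" ]]     || return 1   # unsigned index: fail closed
    [[ "$auto" == "true" ]]       || return 1   # auto:false is never auto-applied
    [[ "$applicable" == "true" ]] || return 1   # target app not installed
    [[ "$applied" != "true" ]]    || return 1   # already applied: skip
    case "$policy" in
        all)               return 0 ;;
        security-breakage) [[ "$severity" == "security" || "$severity" == "breakage" ]] ;;
        *)                 return 1 ;;          # off, or anything unrecognised
    esac
}
```

Treating an unrecognised policy like `off` keeps a typo in CFG_HOTFIX_AUTO from widening what gets applied.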

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-31 20:53:54 +01:00
librelad
a27304a191 fix(distribution): harden the artifact apply pipeline (adversarial review)
A 4-lens adversarial security review of the Phase 2 applier raised 19 issues
and confirmed 17 after per-finding verification. All are trust-boundary (they
require the signing key), but several break the explicit "no code-exec, always
reversible, nothing-silent" contract, so all 17 are fixed:

Trust path — fail CLOSED, never misreport:
- lpFetchIndex now surfaces the real signature state (LP_INDEX_SIGSTATE);
  artifactApply REFUSES to mutate unless the index is actually verified, and
  _artifactFetchPayload refuses an unsigned payload. The read path still
  tolerates dev/unsigned but now says "UNSIGNED" instead of "Signed + verified".
- valid_until and index_serial are now MANDATORY + numeric in lpFetchIndex
  (missing = refuse) — closes the anti-withholding / anti-rollback fail-opens.

Injection / code-exec (defense in depth even for a signed payload):
- runFileWrite rootless branch no longer builds a `bash -c` shell string with the
  destination interpolated — it uses the argv form (like runFileOp), so a path
  with a quote can't inject a command as the install user. (shared-helper fix)
- op paths must match a safe-filename charset (no quotes/$/backtick/;/newline);
  set-config-key values and set-compose-image refs are charset-guarded too.
- content_b64 is validated as real base64 at precheck.
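
A charset guard plus an argv-form write, as described in the bullets above, might be sketched like this. `is_safe_path` and `write_file_argv` are hypothetical names, and the allowed charset is an assumption chosen to be conservative, not the project's actual list.

```bash
#!/usr/bin/env bash
# Hypothetical guard: accept only a conservative path charset, no traversal.
is_safe_path() {
    local p="$1"
    [[ -n "$p" && "$p" != *..* ]] || return 1
    # Letters, digits, dot, underscore, slash, dash. Quotes, $, backtick,
    # semicolon and newline all fall outside the class and are rejected.
    [[ "$p" =~ ^[A-Za-z0-9._/-]+$ ]]
}

# Argv form: the destination is its own argument and is never spliced into
# a shell string, so a quote in it cannot start a new command.
write_file_argv() {
    local dest="$1"
    is_safe_path "$dest" || return 1
    tee -- "$dest" >/dev/null     # content arrives on stdin
}
```

The guard and the argv form are independent layers: either one alone blocks the quote-injection case, which is the defense-in-depth point.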

Reversibility / honest failure:
- dockerComposeUp now returns the real compose exit status (it always returned 0,
  so the updater's rollback gate AND the apply's start-failure detection were
  fail-open). (shared-helper fix)
- set-config-key undo captures the WHOLE config file (lossless) instead of a
  lossy re-parsed scalar; edit-only (rejects an absent key).
- _artifactReplayUndoFile returns non-zero if any inverse op fails; auto-rollback
  and revert now record "rollback-incomplete"/"revert-incomplete" + isError
  instead of falsely claiming success, and revert keeps the record for retry.
- applied-record write failure is checked — apply rolls back rather than leave an
  un-revertable change. System-scope regen failure is no longer swallowed.
- Writes are path-aware (configs/ -> runInstallWrite, container tree ->
  runFileWrite) so system-scope hotfixes write/restore correctly.
- Checked lazy-sourcing surfaces a clear error instead of a bare exit 127.

Unit-tested 35/35 (adds: command-sub value rejection, bad image-ref, invalid
base64, quote/metachar path-injection rejection, replay-failure reporting).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-31 20:47:18 +01:00
librelad
2df4e28a85 feat(distribution): Phase 2 — artifact apply/revert pipeline + ops interpreter
The mutating side of the unified distribution primitive (spec §8.3). Hotfixes
can now be applied and reverted, first-party, through the task system.

New scripts/cli/commands/artifact/cli_artifact_apply.sh:
- artifactApply <id>: resolve+gate (applies_when / min_lp / max_lp /
  max_footprint / publishers-map role) → fetch+verify payload (sha256 pinned by
  the signed index + minisig) → dry-precheck ALL ops (all-or-nothing) → best-
  effort snapshot → apply each op recording a precise inverse → bring app up →
  auto-rollback (replay undo LIFO, snapshot fallback) → applied-record + History.
- artifactRevert <id>: replay the applied-record's undo log (LIFO).
- Bounded, CLOSED op vocabulary (no run-script/exec, ever): set-config-key,
  set-compose-image, patch-file-if-checksum-matches, set-data-file. An
  unsupported op rejects the whole artifact at precheck (fail-closed).
- Write-target firewall: scope:app → containers/<app>/ only; scope:system →
  configs/ only; the install tree (our code) is off-limits to hotfixes (fork 1).
  Drift guards (expect_current / checksum) skip cleanly rather than clobber.
- Two-tier trust: index minisig-verified vs the footprint key (lpFetchIndex)
  covers the envelope; payload sha256-pinned + minisig-verified; publishers-map
  role gate (a non-official publisher can't claim official). Community per-
  artifact-key sigs are gated off until that tier is enabled.
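
The "record a precise inverse, replay LIFO" idea can be illustrated with a line-oriented undo log. A minimal sketch: `undo_record`, `undo_replay` and the one-op-per-line format are hypothetical, and GNU `tac` is assumed.

```bash
#!/usr/bin/env bash
# Hypothetical undo log: one inverse op per line, appended in apply order.
undo_record() {              # <log file> <inverse op>
    printf '%s\n' "$2" >> "$1"
}

# Replay inverses newest-first (LIFO) through <handler>, a function name.
# Returns non-zero if any inverse fails, but still attempts the rest.
undo_replay() {              # <log file> <handler>
    local log="$1" handler="$2" op rc=0
    while IFS= read -r op; do
        "$handler" "$op" || rc=1
    done < <(tac "$log")
    return "$rc"
}
```

Replaying newest-first matters when two ops touch the same file: the later change has to be undone before the earlier one.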

cli_artifact_commands.sh: apply/revert via the task system (artifact_apply /
artifact_revert types — no allowlist needed), + read-only `applied` list.

cli_updater_commands.sh:
- FIX verified safety bug: updaterApplyApp/RollbackApp called `libreportal backup
  app "$app"` and `... restore latest`, which parse the app name as the ACTION,
  hit the dispatcher's `*)` default (exits 0) — so updates ran with NO snapshot
  and rollback was a silent no-op. Call backupAppStart / restoreAppStart directly.
- FIX updaterRecordHistory jq-silent-skip: was `command -v jq || return 0`
  (silently dropped the audit entry). Now fail-closed with a brace-agnostic
  bash-native prepend fallback; extended with artifact_id/serial/undo_id.

fetch.sh: add _lpJsonEsc (shared JSON-escape for the jq-free fallbacks).
Regenerated source arrays + lazy-load manifest for the new file/functions.

Unit-tested 31/31: every op apply+precheck+undo round-trip, the path-allowlist
firewall (incl. .. traversal + install-tree + cross-app rejection), all-or-
nothing abort, unsupported-op rejection, and the History bash-native fallback
(records + preserves prior entries without jq). A full signed-apply e2e needs
minisign + the signing key (Phase 5 make_hotfix.sh).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-31 20:01:11 +01:00
librelad
caee74bd76 feat(distribution): signed artifact-index fetch+verify primitive (Phase 1)
Build the read side of the unified distribution primitive from
docs/roadmap/updates-and-distribution.md: one team-signed catalog
(index.json) on the same channel as latest.json, listing type-tagged
artifact envelopes. A hotfix is the first artifact type; apps/themes/
components are future envelope rows through the SAME pipe — the
marketplace seam is just the `type` + `payload.kind` fields.

Phase 1 is fetch + verify + parse only (NO mutation; the snapshot →
ops → rollback → History apply verb is Phase 2):

- Factor `lpVerifyMinisig` out of `lpFetchRelease` (scripts/source/
  fetch.sh) — one trust anchor (the root-owned footprint key) now
  shared by releases and the index; refactor `lpFetchRelease` to use
  it (behaviour-preserving, still fail-closed).
- scripts/source/artifacts.sh: `lpFetchIndex` — download →
  verify-before-parse → `valid_until` freshness (anti-withholding) →
  `index_serial` monotonic high-water (anti-rollback, TUF-lite) → emit
  verified JSON. Trust core is jq-free; parsing accessors prefer jq
  with a grep fallback.
- `libreportal artifact index` (scripts/cli/commands/artifact/) —
  read-only front door that fetches, verifies and lists. Runs directly
  like `updater check` (no task; no mutation).
- Regenerate the source arrays + lazy-load function manifest for the
  new files.
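
The freshness and high-water checks can be sketched as below. Assumptions: `index_is_acceptable` is a hypothetical name, `valid_until` is taken to be epoch seconds, and the high-water mark lives in a plain file; signature verification, which comes first in the pipeline described above, is out of scope here.

```bash
#!/usr/bin/env bash
# Hypothetical check: refuse a stale or rolled-back index.
# Args: <valid_until epoch> <index_serial> <high-water file>
index_is_acceptable() {
    local valid_until="$1" serial="$2" hw_file="$3" now last=0
    # Both fields must be present and numeric; anything else is refused.
    [[ "$valid_until" =~ ^[0-9]+$ && "$serial" =~ ^[0-9]+$ ]] || return 1
    now=$(date +%s)
    (( 10#$valid_until > now )) || return 1      # stale: anti-withholding
    [[ -f "$hw_file" ]] && last=$(<"$hw_file")
    [[ "$last" =~ ^[0-9]+$ ]] || return 1        # corrupt state: refuse
    (( 10#$serial >= 10#$last )) || return 1     # older serial: anti-rollback
    printf '%s\n' "$serial" > "$hw_file"         # advance the high-water mark
}
```

The mark only advances after every check passes, so a refused index never moves the baseline.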

Doc: promote the format from vision to spec (§8) — 3 layers
(INDEX/ENVELOPE/PIPELINE), the bounded declarative op vocabulary (no
run-script, ever), the apply pipeline mapped onto existing functions,
the marketplace seam, and resolutions for all five open forks.

Self-tested 12/12: trust core fails closed (real key + no minisign →
refuse), happy path, stale-refused, rollback-refused, signature-refused,
jq + grep parsing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-31 16:48:06 +01:00
librelad
f49455e38e fix(de-sudo): route all confirmed container-tree writes through the privileged path
Exhaustive audit (workflow: 19 finders + adversarial per-file verify; 85 raw ->
66 unique -> 39 confirmed) found 36 direct writes into the container-owned tree
that bypass runFileOp/runFileWrite/runCfgOp (manager => EACCES in rootless) plus
3 $?-masking sites. Fixes by area:

- apps: grafana + prometheus install hooks (sudo chmod -> runFileOp chmod);
  gluetun provider etag (tee -> runFileWrite).
- webui generators: task-create (10 sites: mkdir/chown/tee/jq|tee/sed|tee ->
  runFileOp/runFileWrite); app-icons (mkdir/cp/mv); config icon cp; system
  metrics + update throttle stamps (runAsManager touch -> runFileOp touch);
  setup-lock rm; updater history seed + cp.
- task health checker: 4 log writes (tee -a -> runFileWrite -a) + 3 find -delete
  (-> runFileOp find).
- config reconcile: backup cp -> runCfgOp; live cp -> runFileWrite < tmp for
  container-owned configs (the container user can't read a manager 0600 tmp).
- peer pull: tar extract into the container tree -> runFileOp tar.
- masking: ip_find_available + folder_group(x2) — split 'local VAR=$(cmd)' so $?
  reaches the following [[ $? ]] check.
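
The masking idiom is easy to reproduce in isolation. The function names below are illustrative only:

```bash
#!/usr/bin/env bash
might_fail() { return 3; }

masked() {
    local out=$(might_fail)   # $? is now the status of 'local', which is 0
    echo "$?"
}

unmasked() {
    local out
    out=$(might_fail)         # $? is the status of might_fail
    echo "$?"
}
```

Here `masked` prints 0 and `unmasked` prints 3, which is why a `[[ $? ]]` check placed after the combined form can never see the failure.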

15 files, all pass bash -n; fixed idioms confirmed gone.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-31 03:50:48 +01:00
librelad
daa336449a feat(updater): backend — data generator + 'libreportal updater' CLI with DR
- scripts/webui/data/generators/updater/webui_updater_scan.sh (webuiUpdaterScan):
  writes frontend/data/updater/generated/{updates,cves,history}.json from the
  installed-apps DB (current image per app from compose). Available-version +
  CVE-scanner are clearly-marked pluggable hooks; always emits valid JSON.
- scripts/cli/commands/updater/{cli_updater_commands.sh,cli_updater_header.sh}:
  auto-dispatched as 'libreportal updater <sub>' (check/apply/apply-all/rollback).
  apply does disaster-recovery FIRST — snapshots the app via the backup engine,
  then pulls + recreates (real dockerComposeUp/compose-pull helpers), records
  history, and auto-rolls-back on failure. Standard LIBREPORTAL_TASK_EXEC
  enqueue/exec split so WebUI + CLI share locking + audit trail.

New .sh files: the array/function-manifest regen self-heals on deploy; the
check path also sources its generator on demand to cover the gap.

NOTE: host-side bash — written to the repo's conventions but not runnable in
this env; this is the surface to test (the WebUI feature is lp-shot-verified).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-30 03:13:26 +01:00
librelad
9ca5cc6c7c feat(system): full, deletable images list on the Storage page
Replaces the read-only "Largest images" top-10 table with a Tasks-style list of
ALL Docker images, with select-one / select-multiple / clear-all removal that
mirrors the Tasks page UX (row checkboxes, master select-all, a button that
morphs Clear All ↔ Delete Selected (N), an eo confirm modal).

Deletion routes through the task system, NOT a new web API: a new
`libreportal system image rm [--force] <ids>` CLI subcommand (validates each
ref, loops runFileOp docker image rm, reports a tally) is invoked via the
system_image_rm task action — same pattern as Reclaim. The web backend change
is read-only (uncap the existing /storage image list). In-use images are
skipped by default with an opt-in "force-remove" toggle (warned). The page
stays put, toasts, and refreshes on the task's completion event.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-28 21:32:29 +01:00
librelad
b28268a61f feat(system): "Verified" integrity check against the signed release manifest
Adds per-file integrity attestation on top of the existing signed-tarball
release flow. make_release now generates a SHA256SUMS manifest over the shipped
tree and (when a key is configured) signs it, riding both inside the release
tarball so they land in the install tree with no extra download.

lpVerifyInstall (scripts/source/verify.sh) re-hashes the install tree against
that manifest and verifies the manifest's minisign signature against the
root-owned footprint pubkey, yielding states: verified / modified / tampered /
unsigned / unverifiable / development. webuiSystemVerify writes verify_status.json
(throttled daily, force on demand, also after each update apply), surfaced as an
Integrity line + "Verify now" button on the Admin → Overview Updates card and a
row in the update details panel. `libreportal verify` exposes the same check on
the CLI.

Honest framing: this is a self-check (run by the software it verifies), so red
fires only for genuine modified/tampered states; the badge tooltip points to
out-of-band `minisign -Vm` for an independent guarantee.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-28 19:41:22 +01:00
librelad
49cf7e8bec ux(system): move Reclaim button top-right, make it actually free space
Three fixes from testing the storage page:

- Placement: the "Reclaim space" button moves into the page header,
  top-right (matching the metric page), instead of sitting in the body.

- It now actually reclaims: build cache needs -a to drop (docker reports
  0 B "reclaimable" without it, but it's pure cache — safe to clear), so
  the CLI uses `docker builder prune -af`. Previously the safe scope
  freed ~nothing on a box whose reclaimable was mostly cache.

- Honest "Reclaimable" number: /api/system/storage was counting the
  whole build cache AND unused tagged images, overstating what the safe
  prune frees (e.g. 340 MB shown, ~96 MB per docker, button cleared 0).
  Reclaimable now = dangling images + build cache only; stopped
  containers and volumes are never counted (the safe prune never touches
  them). Headline now matches the button's effect.

Also simplify the CLI output (drop the jargony scope notice and the
reclaimed-total greps) and re-enable the now-persistent header button
after the post-reclaim refreshes.

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-28 19:06:02 +01:00
librelad
3031c6cab9 feat(system): "Reclaim space" action on the Storage page
Adds a `libreportal system reclaim` CLI command and an orange "Reclaim
space" button on /admin/config/system/storage (the v2 prune control the
page always hinted at).

Scope is deliberately SAFE: build cache + dangling (untagged) images
only (docker builder prune -f + docker image prune -f via the
rootless-aware runFileOp). It never touches volumes (app data) or
tagged/in-use images, so nothing an app relies on is removed.

Wiring mirrors system_update: a systemReclaim() action + system_reclaim
route case run the command verbatim through the task processor. The
button confirms via showConfirmation, shows a spinner, and re-reads
storage usage as the prune lands. Button styled with --status-warning to
match the Reclaimable stat it sits under, with a note clarifying scope.

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-28 18:50:27 +01:00
librelad
3f582120ba feat(cli): route all long-running app + update commands through tasks
Extends the install-routing spike (e5273a4) to every long-running CLI
command, so CLI and WebUI now share one execution path everywhere:

  app install      ← already done
  app uninstall
  app start / stop / restart / up / down / reload
  app backup
  app restore
  update apply
  backup app create   (matches `app backup` — same end target)

Each handler now has the same shape:
  if [[ "$LIBREPORTAL_TASK_EXEC" == "1" ]]; then
      <inline call>            # processor's recursive invocation
  else
      cliTaskRun "<cmd>" <type> <app>   # user invocation: enqueue + follow
  fi

Processor change — crontab_task_processor.sh:
  Adds `export LIBREPORTAL_TASK_EXEC=1` next to LIBREPORTAL_NONINTERACTIVE.
  Universal bypass: every task command the processor runs (CLI-queued OR
  pre-existing WebUI-queued like `libreportal app install adguard`)
  inherits the env var, so the inline branch fires and we never
  re-enqueue. This also lets us drop the env-var prefix the install spike
  was baking into the command string (e5273a4) — cleaner task files +
  one place to think about the bypass.

`backup app schedule` (the cron-driven path that already enqueues via
createTaskFile in backup_app_schedule.sh) is left alone — different
entry point, different runtime context, already correctly task-routed.

Why route the fast ones too (start/stop/restart/up/down):
  Consistency beats the ~1s task-roundtrip latency for a CLI button.
  Locking now serialises a CLI `app stop foo` against a WebUI restart of
  the same app; the audit trail covers every state change. Cheap to
  revert any individually if the latency turns out to bother someone.

Validated live earlier with `libreportal app install dashy` — task file
written, processor dispatched, follower streamed install live, exit 0
propagated. Same machinery now powers the other 9 handlers.

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-27 14:38:14 +01:00
librelad
e5273a482d feat(cli): route app install through the task processor + live follower
Spike — closes the gap where the CLI install bypassed the very task system
the WebUI uses. Now both surfaces hit the same path:

  user types `libreportal app install dashy`
    → CLI enqueues a task file in $TASK_DIR (identical shape to the
      WebUI's createTaskFile)
    → pokes $TASK_DIR/.queue.fifo so the processor dispatches in <100ms
      instead of waiting up to IDLE_POLL_SECS
    → CLI tails the task log + polls .status, exits with the task's
      exit_code on terminal state
    → Ctrl-C detaches the follower without killing the task — the
      WebUI's tasks panel keeps showing it

Bypass: the recursive command in the task file is prefixed
`LIBREPORTAL_TASK_EXEC=1 libreportal app install <name>`. The install
branch in cli_app_commands.sh honours that env var by running inline,
which is what the processor's eval invocation hits. No processor
changes — the bypass travels with the task.

Wins:
  - one log file per install, shared by CLI + WebUI (audit trail + replay)
  - locking serialises CLI + WebUI installs (no more two-frontend race)
  - WebUI's "current task" indicator now reflects CLI work too
  - free `--detach` for fire-and-forget queueing

New: scripts/cli/task/cli_task_run.sh
  cliTaskRun <cmd> [type] [app] [--detach]
    Enqueues + follows; --detach prints the task id and exits 0.
  cliTaskFollow <task_id>
    `tail -F` the log + jq-poll the status; returns the task's exit_code.
    Designed to be reused for `libreportal task log <id>` reattach later.

Trade-off: ~200-500ms latency before the first byte (write task file,
processor wakes, opens log, follower starts tailing). Negligible for
install/update/backup — fast commands (list/status/config get) still
run inline. The current branch only changes `app install`; uninstall +
update + backup can be moved on the same pattern once this lands clean.

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-27 14:29:30 +01:00
librelad
52e0227bb6 chore(cleanup): retire appGenerate — dead-on-arrival app-skeleton wizard
`libreportal app generate <name>` (and the menu's "g. Generate App" entry)
was broken three independent ways and incompatible with the per-app
architecture the project actually uses now:

  1. Copies from $install_containers_dir/template/ which doesn't exist —
     the only template/ in the tree was in scripts/unused/OLD_CONTAINERS/
     and was never installed into the live tree. cp -r would just fail.

  2. Every sed call used BSD/macOS syntax `sed -i '' -e …`. On Linux
     (every distro this targets) the empty '' becomes a positional file
     argument, so the substitutions never ran. 8 calls, all broken.

  3. Even if it had run, the produced skeleton would have been a
     pre-modular-tools / pre-per-port-subdomain app shape: no tools/,
     no scripts/ subdir, HOST_NAME=test in the .config. Every active
     containers/<app>/ today carries the modular layout the rest of the
     framework expects.

Plus the recent cleanups (the prompt loop fix in 9ffc8e4, the per-port
subdomain refactor in 2e4f420) had been peeling pieces off it without
the root question — does the function still belong? — getting asked.

Delete the whole surface:
  - scripts/app/app_generate.sh (157 lines, the function body)
  - scripts/unused/OLD_CONTAINERS/template/ (the never-installed source
    files appGenerate would have copied — stale enough to still carry
    HOST_NAME=test, CFG_<X>_HOST_NAME, and 248 lines of compose template)
  - menu entry "g. Generate App" + its dispatch in menu_main.sh
  - "generate" case branch in cli_app_commands.sh
  - `libreportal app generate` line in cli_app_header.sh
  - The corresponding entries auto-drop from files_app.sh +
    function_manifest.sh via regen.

New apps are added the way the catalog already grew — by hand-crafting
containers/<app>/{<app>.sh, <app>.config, docker-compose.yml,
tools/<app>.tools.json, scripts/<app>_*.sh}. Copying an existing app's
folder + renaming is the closest thing to a "generator" and it's a one-
command operation.

Net: -556 lines, no behaviour lost (the function never worked).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-26 23:48:35 +01:00
librelad
a4d3b78cdb feat(debug): LP_LOAD_TRACE + 'libreportal debug load-trace' (lazy-load Phase 1)
First step toward an autoload-style lazy loader for the 499-file source
tree (current cold load ~1s wall / 340ms user-time per CLI invocation,
mostly spent sourcing files the command never calls). This commit only
measures — no behaviour change unless LP_LOAD_TRACE=1.

LP_LOAD_TRACE=1 instrumentation (scripts/source/loading/initilize_files.sh):
  Wraps each `source` in the main file-list loop with EPOCHREALTIME
  before/after, writes `<elapsed_ms>\t<file_relpath>` to
  $LP_LOAD_TRACE_FILE (default /tmp/libreportal-load-trace.<pid>.log).
  Zero overhead when the env var is unset (one [[ test per file).
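
A per-file timing wrapper along these lines might look as follows. This is a sketch, not the instrumented loop itself: `traced_source` is a hypothetical name, bash 5+ is assumed for EPOCHREALTIME, and because the file is sourced inside a function, any `declare` it runs without -g becomes function-local.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: source a file and log "<elapsed_ms>\t<file>".
traced_source() {
    local file="$1" t0 t1
    if [[ "${LP_LOAD_TRACE:-}" != "1" ]]; then
        source "$file"; return            # untraced path: one test, then source
    fi
    t0=${EPOCHREALTIME/[.,]/}             # drop the separator: microseconds
    source "$file"
    t1=${EPOCHREALTIME/[.,]/}
    printf '%d\t%s\n' $(( (t1 - t0) / 1000 )) "$file" \
        >> "${LP_LOAD_TRACE_FILE:-/tmp/libreportal-load-trace.$$.log}"
}
```

Stripping the decimal separator turns the timestamp into integer microseconds, so the subtraction needs no external tool such as `bc`.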

libreportal debug load-trace [cmd...]:
  New `debug` CLI category. Spawns a child `libreportal <args>` (default
  'help') with LP_LOAD_TRACE=1, then awk-aggregates the trace: wall vs
  cumulative source time, file count, top-15 hottest files. The diff
  between wall and cumulative-source = bash startup + dispatch + the
  command's own work.

Used in the next phases to (a) validate that the lazy loader actually
delivers the speedup we expect and (b) flag any single file that hogs
disproportionate time (rare `heredoc | sed | base64` style work at
source time would show up here as a >10ms entry).

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-26 20:33:22 +01:00
librelad
3fe2c0660a feat(peers): direct peer SSH — pairing + peer-shell + pull (Phase 3)
End-to-end direct-ssh-direct: two LibrePortal instances exchange pairing
tokens, each authorizes the other to call a locked-down peer-shell dispatcher
via SSH forced-command, then either side can pull live app data from the
other without needing a shared backup repo.

Push and Connect-via-relay are deferred — push is symmetric to pull (same
forced-command, opposite verb), and the relay variant waits for Connect to
actually exist (config_json + kind enum already future-proofed in Phase 2).

Key generation (peer_key.sh):
  One ed25519 keypair per install at ~<manager>/.ssh/libreportal-peer{,.pub}.
  Generated lazily on the first peer-related call. Used as our outbound
  SSH identity AND as the pubkey other instances authorize.

Forced-command dispatcher (peer_shell.sh):
  Standalone script, deployed by peerInstallShell() to
  ~<manager>/.local/bin/peer-shell. authorized_keys entries look like:
    command="~/.local/bin/peer-shell <peer-name>",no-pty,no-port-forwarding,
    no-X11-forwarding,no-agent-forwarding,no-user-rc ssh-ed25519 AAAA… peer:<name>
  sshd hands us $SSH_ORIGINAL_COMMAND; we parse, whitelist the verb, and
  refuse anything else. Verbs:
    ping        Liveness probe (JSON ok:true).
    list-apps   JSON {peer, apps:[{slug, size_kb}]}.
    stream-app  tar of containers_dir/<slug> to stdout (slug strictly
                validated — lowercase alnum+dash; rejects path traversal).
  Audit log appended to ~/.local/state/libreportal/peer-shell.log. Excluded
  from the generated source arrays (would crash any sourcing shell on empty
  SSH_ORIGINAL_COMMAND); generate_arrays.sh skip-list extended.
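
The whitelist-and-refuse shape of such a dispatcher can be sketched as below. `peer_dispatch` is a hypothetical name; it takes the command string as an argument instead of reading SSH_ORIGINAL_COMMAND, and it prints what it would do rather than streaming data. The slug pattern is slightly stricter than the description (it also rejects a leading dash), which is an assumption of this sketch.

```bash
#!/usr/bin/env bash
# Hypothetical dispatcher body for a forced-command entry point.
peer_dispatch() {
    local verb arg
    read -r verb arg <<< "${1:-}"      # $1 stands in for SSH_ORIGINAL_COMMAND
    case "$verb" in
        ping)      echo '{"ok":true}' ;;
        list-apps) echo "would list apps" ;;
        stream-app)
            # Lowercase alnum + dash only; slashes and dots never match.
            [[ "$arg" =~ ^[a-z0-9][a-z0-9-]*$ ]] \
                || { echo "refused: bad slug" >&2; return 1; }
            echo "would stream $arg" ;;
        *)         echo "refused: unknown verb" >&2; return 1 ;;
    esac
}
```

Because `read` puts the whole remainder into `arg`, extra words after the slug fail the pattern too, so nothing trailing can ride along.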

Pairing token (peer_pairing.sh):
  Format: lp-peer|v1|<name>|<user>|<host>|<port>|<base64-pubkey>|<fingerprint>
  Pipe-delimited because the SHA256 fingerprint and base64 pubkey both
  contain ':'. peerPairingParse decodes + re-derives the fingerprint from
  the actual key, refusing tokens with mismatched fingerprints (catches
  truncation / tampering). peerPairingAccept:
    1. Installs peer-shell (peerInstallShell).
    2. Appends to authorized_keys with the lockdown options above.
    3. Inserts a peers row (kind=direct-ssh-direct, config carries host,
       port, user, fingerprint).
  Symmetric — user runs accept on BOTH sides with the other's token to
  enable bidirectional calls.
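
Splitting the pipe-delimited token is a single `read`. A sketch only: `parse_pairing_token` is a hypothetical name, the printed `user@host:port` summary is an arbitrary choice, and the fingerprint re-derivation described above is deliberately left out.

```bash
#!/usr/bin/env bash
# Hypothetical parser for: lp-peer|v1|<name>|<user>|<host>|<port>|<pubkey>|<fp>
parse_pairing_token() {
    local token="$1" magic ver name user host port pub fp extra
    IFS='|' read -r magic ver name user host port pub fp extra <<< "$token"
    [[ "$magic" == "lp-peer" ]]   || { echo "not-a-token" >&2; return 1; }
    [[ "$ver" == "v1" ]]          || { echo "unsupported-version" >&2; return 1; }
    # Exactly eight fields: the last must be present, nothing may follow it.
    [[ -n "$fp" && -z "$extra" ]] || { echo "not-a-token" >&2; return 1; }
    [[ "$port" =~ ^[0-9]+$ ]]     || { echo "not-a-token" >&2; return 1; }
    printf '%s@%s:%s\n' "$user" "$host" "$port"
}
```

The `extra` variable catches tokens with too many fields, since `read` would otherwise fold the surplus into the fingerprint.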

Outbound SSH (peer_remote.sh):
  peerExec <name> <verb> [args] — looks up the peer's connection config and
  ssh's in with the right key, BatchMode + ConnectTimeout + accept-new for
  the host key. peerPing wraps it and updates peers.status + last_seen.

Pull-an-app (peer_pull.sh):
  peerPullApp <peer> <app> [--no-pre-backup] [--keep-urls]
    1. peerPing (refuse if unreachable).
    2. migratePreBackupDestination (reuses the Phase 0 safety wrapper —
       same restic-tagged pre-migrate snapshot as the backup-channel flow).
    3. Stop + wipe destination's app folder.
    4. peerExec stream-app | tar -x (pipefail; bails on partial transfers).
    5. migrateApplyUrlRewrite + dockerComposeUpdateAndStartApp install
       (URL repointing, idempotent install path).
    6. dockerComposeUp + post-restore hooks.
  Identical Stage-2..6 to migrateApplyApp; only the data source differs
  (tar-over-SSH instead of restic-restore).

CLI (cli_peer_commands.sh + header):
  libreportal peer token                — emit this host's pairing token
  libreportal peer pair <token> [name]  — accept a token (override name)
  libreportal peer apps <peer>          — live peer-shell list-apps
  libreportal peer pull <peer> <app> [--no-pre-backup] [--keep-urls]

WebUI (/peers):
  Header gains 'Show my token' and 'Pair with token' buttons (both open
  modals around the matching CLI verbs). Token modal warns the user that
  the token is credentials. Pair modal accepts a free-form override name.
  Direct-SSH peer cards gain a 'List apps' button that opens an inline
  drawer showing the peer's live app inventory (via peer apps) with per-
  app 'Pull' buttons. Pull modal has the same two safety toggles as the
  Migrate tab (pre-backup ON, URL rewrite ON by default).
  Backup-channel manual-add modal kept; direct-SSH must use the token flow.

Smoke-tested:
  - All 16 peer-subsystem functions register without crashing the shell.
  - peer-shell ping ⇒ {ok:true}; unknown-verb refused; path-traversal slug
    refused; valid-slug streams.
  - Token emit→parse round-trip preserves every field; garbage rejected
    with not-a-token; v99 rejected with unsupported-version.
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-26 17:56:57 +01:00
librelad
1014dd6e42 feat(peers): introduce 'Peer' as a first-class concept (Phase 2)
A peer is a named reference to another LibrePortal instance. Phase 2 only
implements kind=backup-channel (friendly label over a hostname that shows
up in a shared backup repo); direct-ssh-direct and direct-ssh-via-relay
(Connect's blind-relay) are reserved enum values for Phase 3.

DB schema (db_create_tables.sh):
  CREATE TABLE peers (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    name         TEXT UNIQUE NOT NULL,
    kind         TEXT NOT NULL DEFAULT 'backup-channel',
    config_json  TEXT NOT NULL DEFAULT '{}',
    status       TEXT DEFAULT 'unknown',
    last_seen    TEXT,
    created_at   TEXT DEFAULT CURRENT_TIMESTAMP
  );
  + indexes on name and kind.

  config_json is kind-specific so new transports don't need a schema
  migration. For backup-channel it carries {"hostname":"","loc_idx":N}.

Bash module (scripts/peer/):
  peer_helpers.sh   _peerDb, peerSqlEscape, peerValidateName/Kind.
  peer_add.sh       peerAdd <name> <kind> [k=v ...] → INSERT, refresh
                    generator. Rejects unimplemented kinds early so users
                    don't create dead-end peer records.
  peer_remove.sh    peerRemove <name> → DELETE.
  peer_list.sh      peerList → JSON array; peerGet, peerNameForHostname
                    (reverse-lookup for the migrate-tab overlay).
  peer_check.sh     peerCheckReachable, peerCheckAll. For backup-channel
                    'reachable' = at least one snapshot from that hostname
                    visible in (preferred|any enabled) location. Updates
                    status + last_seen so UI dots render without re-probing.
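
Since peerAdd takes k=v arguments and stores them in config_json, the conversion can be sketched as below. The function name peerConfigJson is hypothetical and the escaping covers only backslashes and double quotes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: turn "k=v" arguments into a flat JSON object.
# Integer values are emitted bare, everything else as an escaped string.
peerConfigJson() {
    local pair key val out="" sep=""
    for pair in "$@"; do
        key="${pair%%=*}"; val="${pair#*=}"
        if [[ "$val" =~ ^(0|-?[1-9][0-9]*)$ ]]; then
            out+="${sep}\"${key}\":${val}"
        else
            val="${val//\\/\\\\}"      # escape backslashes first
            val="${val//\"/\\\"}"      # then double quotes
            out+="${sep}\"${key}\":\"${val}\""
        fi
        sep=","
    done
    printf '{%s}\n' "$out"
}
```

For example, `peerConfigJson hostname=nas.local loc_idx=2` prints `{"hostname":"nas.local","loc_idx":2}`.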

CLI (scripts/cli/commands/peer/):
  libreportal peer list
  libreportal peer get <name>
  libreportal peer add <name> backup-channel hostname=<host> [loc_idx=<n>]
  libreportal peer remove <name>
  libreportal peer check [name]

  Auto-routed by cli_initialize.sh's category-discovery.

WebUI data generator (scripts/webui/data/generators/peers/webui_peers.sh):
  Emits data/peers/generated/peers.json with the peerList output and a
  generated_at envelope. Hooked into webuiLibrePortalUpdate alongside the
  backup generators.

Frontend:
  - New top-level /peers route in spa.js (PeersPage class, peers-content.html).
  - 'Peers' nav item in the topbar between Backups and the right-side controls.
  - Add-peer modal with friendly-name + kind + hostname + preferred-location
    selector (populated from the existing backup-locations data).
  - Per-peer card with status dot, last-checked time, Check + Remove buttons.
  - Phase 3 kinds appear in the kind dropdown as disabled options so users
    can see what's coming.

Source-array wiring:
  - generate_arrays.sh auto-created files_peer.sh from the new peer/ dir.
  - cli_files.sh + app_files.sh include ${peer_scripts[@]} alphabetically.
  - files_webui.sh auto-picked-up the new peers/ generator subfolder.

The migrate-tab friendly-name overlay (use peer names in /backup/migrate
when a peer record exists for a hostname) is intentionally deferred — it's
a 5-line frontend lookup once peers.json is loaded; cleaner to add after
Phase 3 ships its peer-detail view.

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-26 17:43:56 +01:00
librelad
32b2840d73 refactor(migrate)!: rewrite kernel — discover/preflight/apply with JSON progress
Phase 0 of the migration-system refresh. Replaces the 77-line
scripts/migrate/ with a properly-shaped kernel that Phase 1 (WebUI) and
Phase 3 (direct peer SSH) can both build on.

New module layout (6 files):
  migrate_progress.sh   — migrateEmit JSON-per-line helper; opt-in via
                          MIGRATE_JSON_PROGRESS=1, writes to fd 3 if open
                          (clean WebUI streaming channel) else stdout.
  migrate_discover.sh   — migrateDiscoverHosts / migrateDiscoverApps /
                          migrateDiscoverAppDetail (JSON {snapshots, latest_*}).
                          Old migrateDiscoverAppsForHost kept as back-compat.
  migrate_preflight.sh  — migratePreflight emits one JSON object with
                          snapshot{id,date}, destination{installed,running,
                          disk_free_kb}, collision{occurs,default_action,
                          pre_backup_default}, url_rewrite{default_action,
                          per_app_opt_out}, warnings[], errors[].
                          Exit 0 on usable preflight, 1 on hard error.
  migrate_url_rewrite.sh— Host-bound CFG_<APP>_* fields (URL/HOST/DOMAIN/
                          DOMAIN_PREFIX/HOSTNAME/PUBLIC_URL) get rewritten
                          from the destination's install-template after
                          restore — so a moved app stops claiming the
                          source's hostnames. Per-app opt-out via
                          CFG_<APP>_MIGRATE_URL_REWRITE=false. All other
                          fields (DB passwords, API keys, prefs) carry
                          over from the source unchanged.
  migrate_pre_backup.sh — migratePreBackupDestination takes a snapshot of
                          the destination's existing <app> (tagged
                          pre-migrate=<UTC timestamp>) before the wipe.
                          Default ON; opt-out with --no-pre-backup. Safety
                          net for the always-replace collision policy.
  migrate_apply.sh      — migrateApplyApp / migrateApplySystem. Parses
                          --no-pre-backup / --keep-urls / --json-progress
                          opts, runs preflight → pre-backup → restoreAppStart
                          (existing flow) → URL rewrite → re-deploy compose.
                          migrateApp / migrateSystem kept as shims so the
                          old CLI surface still works.
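
The fd-3 contract described for migrate_progress.sh can be sketched as follows. This is not the project's migrateEmit: the name emitProgress and the key/value calling convention are assumptions; only the MIGRATE_JSON_PROGRESS=1 opt-in and the fd 3 / stdout fallback come from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical JSON-per-line emitter.
# Usage: emitProgress key value [key value ...]
# Silent unless MIGRATE_JSON_PROGRESS=1; writes to fd 3 when open, else stdout.
emitProgress() {
    [[ "${MIGRATE_JSON_PROGRESS:-0}" == "1" ]] || return 0
    local line="" sep="" key val
    while (( $# >= 2 )); do
        key="$1"; val="$2"; shift 2
        if [[ ! "$val" =~ ^(0|-?[1-9][0-9]*)$ ]]; then
            val="${val//\\/\\\\}"; val="${val//\"/\\\"}"
            val="\"${val}\""           # strings are quoted, integers stay bare
        fi
        line+="${sep}\"${key}\":${val}"; sep=","
    done
    if { true >&3; } 2>/dev/null; then  # probe: is fd 3 open for writing?
        printf '{%s}\n' "$line" >&3
    else
        printf '{%s}\n' "$line"
    fi
}
```

A streaming consumer would open fd 3 on a pipe or file (`... 3>progress.ndjson`), leaving stdout free for human-readable output.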

CLI dispatcher (cli_restore_commands.sh + cli_restore_header.sh):
  Existing 'restore migrate app/system/discover' calls all still work.
  New verbs:
    restore migrate list <host> [loc_idx]
    restore migrate preflight <host> <app> [loc_idx]   ← JSON, for the WebUI

Design choices baked in (per the spec):
  - Always-replace collision (no multi-install of an app), safety net is the
    on-by-default pre-migrate backup.
  - URL rewrite by host-bound suffix list, not per-field allowlist — works
    out-of-the-box for new apps without extra config.
  - migrateEmit fd-3 contract is what Phase 1's WebUI will stream; falls
    back to stdout in interactive CLI so dev/debug just works.
  - Transport-agnostic: nothing in this kernel knows whether the backup
    location is local/SSH/S3/Connect — engineSnapshotsJson + engineBackupApp
    do that, so Connect (the future blind-relay) plugs in as 'just another
    location kind' with zero kernel changes.
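
A sketch of the suffix-list idea, using the suffixes listed for migrate_url_rewrite.sh. The function name isHostBoundField is hypothetical, and the plain suffix match is an assumption about how the list is applied.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide whether a CFG_<APP>_<FIELD> name is host-bound
# by checking whether it ends in one of the listed suffixes.
isHostBoundField() {
    local name="${1:-}" suffix
    [[ "$name" == CFG_* ]] || return 1
    for suffix in PUBLIC_URL DOMAIN_PREFIX HOSTNAME DOMAIN HOST URL; do
        [[ "$name" == *_"$suffix" ]] && return 0
    done
    return 1
}
```

Note that a naive suffix match like this one would also flag a name such as CFG_<APP>_DB_HOST; whether the real implementation guards against that is not stated here.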

Smoke-tested: all 13 public functions register; JSON emit produces correct
escaping (quoted strings vs bare numerics) and respects MIGRATE_JSON_PROGRESS.

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-26 17:22:54 +01:00
librelad
839cf3561a feat(cli): backup system / restore system subcommands
Expose the system-config backup on demand (not just within 'backup all'):

- `libreportal backup system`      -> backupSystemConfig (snapshot the system
  config — settings, WebUI creds, backup-location creds — to all enabled locations)
- `libreportal restore system [loc_idx]` -> backupRestoreSystemConfig (restore the
  latest system snapshot into a staging dir; never overwrites live config)

Distinct from the existing 'restore migrate system' (which restores all *apps*
from another host). Help text updated for both. Routing verified with stubs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-26 00:27:25 +01:00
librelad
899e04bcd3 feat(regen): unified regeneration front door + self-heal poll
Add `lpRegen` (scripts/webui/webui_regen.sh) — one entry point that rebuilds the
file-derived artifacts whose sources changed, so callers don't have to know which
generator owns what. Self-heal is a cheap `find -newer` mtime compare (no watcher
/ daemon): a stage runs only when a source is newer than its artifact, or --force.

- `libreportal regen [all|webui|arrays] [--force]` CLI command (new category).
- Task processor idle tick runs a throttled `regen webui` poll, so an app dropped
  in out-of-band (drag-drop / marketplace) appears on its own — no manual command,
  no inotify (works on the relocatable/external-drive roots where inotify can't).
- make_release.sh guards against shipping stale source arrays (regenerate; abort
  if the committed tree was out of date), killing the "forgot generate_arrays" bug
  class at the build boundary.
- Document the front door in DEVELOPMENT.md.

webui scope rebuilds from containers/<app>/{*.config,tools/*.tools.json}; arrays
scope from scripts/** (a dev/build concern — a no-op on a normal install). Gate
logic verified in a sandbox (clean/config-newer/tools-newer/force/missing).
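
The mtime gate can be sketched with find -newer as below. The function name needsRegen and its argument order are assumptions; the three outcomes (missing artifact, newer source, --force) mirror the cases listed above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an mtime gate: succeed (exit 0) when a rebuild is due.
# Usage: needsRegen <artifact> <source-dir> [--force]
needsRegen() {
    local artifact="$1" srcdir="$2" force="${3:-}"
    [[ "$force" == "--force" ]] && return 0     # caller insists
    [[ -e "$artifact" ]] || return 0            # missing artifact: always build
    # Any source file newer than the artifact makes find print one path.
    [[ -n "$(find "$srcdir" -type f -newer "$artifact" -print -quit)" ]]
}
```

The check costs one directory walk and no persistent watcher, which is why it also works where inotify is unavailable.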

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-25 23:20:02 +01:00
librelad
8b14f26125 refactor(desudo): route scattered runtime sudo through privilege helpers
Convert the remaining ad-hoc 'sudo' calls across the data plane to the
run_privileged helpers so every file op lands as the correct owner with
no blanket root:

- DB/configs (manager-owned): db_list_all_apps, delete_db_file,
  install_sqlite, cli_webui_commands -> runInstallOp
- containers (dockerinstall-owned): scan_container_socket, delete_data,
  webui_task_files, webui_app_log, webui_config_patch,
  application_missing_variables, uninstall_app -> runFileOp/runFileWrite
- genuine root: passwd, tailscale, ufw-docker, sysctl grep, systemd
  unit read, authorized_keys read, nobody chown -> runSystem
- interactive editors and 'id -u': drop sudo entirely (run as caller)
- owncloud/adguard container-UID config edits -> runSystem (funnel;
  docker-exec rework deferred)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-24 18:00:19 +01:00
librelad
3a679d7343 feat(ssh): admin host SSH-access engine (backend + CLI + snapshot)
Fresh, on-demand inbound SSH-access management for the host (replaces the old
maze). scripts/ssh/host_access.sh manages the install user's authorized_keys —
add a pasted public key (validated), list, remove — and toggles sshd password
login behind a lockout guard (won't disable passwords with no key; won't drop
the last key while passwords are off; sshd -t before reload, with backup).
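
The two lockout rules can be expressed as small predicates. This is a sketch with hypothetical function names; it takes plain values rather than reading sshd_config or authorized_keys, so it shows only the decision logic.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the two lockout rules.
#   key_count      number of keys currently in authorized_keys
#   password_auth  "on" or "off"
canDisablePasswords() {
    local key_count="$1"
    (( key_count > 0 ))           # need at least one key to get back in
}

canRemoveKey() {
    local key_count="$1" password_auth="$2"
    # Removing the last key is only safe while password login still works.
    (( key_count > 1 )) || [[ "$password_auth" == "on" ]]
}
```

Either predicate failing should abort the change before sshd is touched; the config test with sshd -t then covers syntax errors separately.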

New 'ssh' CLI category (status/key-add/key-remove/password-auth/generate) and
a webuiGenerateSshAccess snapshot (data/ssh/access.json: user, password_auth,
authorized keys as type+fingerprint+comment — public only) wired into the
regen chain. Nothing runs automatically; only explicit admin actions change
anything. WebUI page next.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-23 16:40:59 +01:00
librelad
7b32dc2e29 fix(backup): clean snapshot-id capture + accept --latest on restore
Found while testing live backups end-to-end:

- Engine backup adapters logged to stdout, so the caller's $() snapshot-id
  capture was polluted with log text — verify-after-backup then failed with
  'no matching ID' on every run. Route their log lines to stderr so stdout is
  only the id (restic/borg/kopia).
- 'libreportal app restore <app> --latest' (as the help advertises) and the
  bare 'restore <app>' both failed: --latest was passed to restic verbatim and
  unset args arrive as the literal 'empty'. Normalise both to 'latest'.
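
Both fixes can be illustrated in a few lines. fakeBackup stands in for an engine adapter and its id is made up; normaliseSnapshotArg is a hypothetical name for the argument clean-up.

```shell
#!/usr/bin/env bash
# Sketch: keep stdout clean for $() capture by sending log lines to stderr.
fakeBackup() {
    echo "starting backup..." >&2     # log line: stderr, invisible to $()
    echo "a1b2c3d4"                   # the only thing on stdout: the snapshot id
}

# Hypothetical normaliser: map --latest, the literal 'empty', or nothing
# to 'latest'; pass any other value through unchanged.
normaliseSnapshotArg() {
    case "${1:-}" in
        ""|empty|--latest|latest) echo "latest" ;;
        *)                        echo "$1" ;;
    esac
}
```

With this shape, `id="$(fakeBackup)"` captures only the id while the log line still reaches the terminal.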

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-23 16:39:56 +01:00
librelad
19c76f0a3f feat(backup): CLI + data plumbing for per-location SSH keys
Expose the existing location_ssh.sh key store through the backup CLI:
'backup location ssh-key-set|ssh-key-generate|ssh-key-public|ssh-key-delete <idx>'
(the WebUI runs these as tasks). The locations generator now emits
ssh_key_exists + ssh_public_key (public key only — the private key never
leaves the per-location ssh.key file), so the editor can show the key state.
Also fix the stale SSH_AUTH label (~/.ssh/id_rsa -> managed per-location key).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-23 16:11:31 +01:00
librelad
d6e7df8ada refactor(backup): move location field schema to a generated JSON
The per-type field map lived hardcoded in backup-page.js. Add a
webuiGenerateBackupSchema generator that emits the type -> ordered field list
to data/backup/generated/schema.json (wired into the backup regen chain and
the CLI 'webui generate backup'). The editor fetches it into this.locSchema
and reads it via locFieldsForType; BACKUP_LOC_FIELDS_BY_TYPE stays only as a
fallback if the fetch fails.

Keeps the data-in-generators pattern consistent — the schema now has one
backend source of truth. The dynamic show/hide behaviors (SSH auth, path
mode, engine filtering) remain frontend logic by nature.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-23 15:22:53 +01:00
librelad
4ce0340ef8 refactor(backup): replace per-app cron stagger with task-queue scheduler
Application backups were driven by one crontab entry per app, each offset by
id * CFG_BACKUP_CRONTAB_APP_INTERVAL minutes. That minute offset is written
straight into cron's 0-59 minute field, so past ~20 apps it overflowed into
an invalid entry that silently never fired, and the fixed spacing could not
serialize backups that ran longer than the gap.
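
The overflow is simple arithmetic. A sketch with a hypothetical helper; the interval values used in the example are illustrative, not the project's defaults.

```shell
#!/usr/bin/env bash
# Sketch of the overflow: minute = id * interval, but cron only accepts 0-59.
# Prints the first app id whose computed minute no longer fits.
firstOverflowId() {
    local interval="$1" id=0
    (( interval > 0 )) || return 1
    while (( id * interval <= 59 )); do
        id=$(( id + 1 ))
    done
    echo "$id"
}
```

With an assumed 3-minute spacing, `firstOverflowId 3` prints 20 (20 * 3 = 60), which is in line with the "past ~20 apps" observation.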

Replace it with a single daily entry (`libreportal backup scheduled`) that
enqueues a backup task per enabled app. The existing systemd task processor
drains them serially — no minute overflow, real serialization, and backups
are now visible/cancellable in the Tasks UI. Per-app enable is read from
CFG_<APP>_BACKUP at schedule time instead of being mirrored into crontab.
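
Reading the per-app flag at schedule time could look like the sketch below. The function name appsToBackup is hypothetical, the enabled value "true" is an assumption, and app names are assumed to be valid variable-name fragments. A real scheduler would enqueue a task per name instead of printing it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list the apps whose CFG_<APP>_BACKUP flag is "true".
appsToBackup() {
    local app var
    for app in "$@"; do
        var="CFG_${app^^}_BACKUP"        # e.g. nextcloud -> CFG_NEXTCLOUD_BACKUP
        if [[ "${!var:-false}" == "true" ]]; then
            echo "$app"
        fi
    done
}
```

Because the flag is looked up on every run, toggling it takes effect at the next schedule without touching crontab.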

Removes the stagger machinery (timing/setup/check/remove scripts), the
now-unused cron_jobs table + insert, and the CFG_BACKUP_CRONTAB_APP_INTERVAL
config knob and its WebUI field.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-22 14:34:35 +01:00
librelad
d5fe1bc56b feat(webui): out-of-date detection + one-click update
Surface when LibrePortal is behind upstream and let users update from the
WebUI, reusing the proven git-update path instead of reinventing it.

Detection (host): webuiSystemUpdateCheck writes
frontend/data/system/update_status.json from a throttled git fetch +
behind-count + VERSION compare, off the existing per-minute
`webui generate system` cron. A new /VERSION file is the canonical version.

Display (frontend): update-notifier.js/.css render a global topbar badge
(every page) and a dashboard banner (prominent when behind, subtle "up to
date" with a manual check otherwise), plus a details panel.

Actions go through the task pipeline:
- `libreportal update apply` -> webuiRunUpdate (non-interactive: guards,
  forced check, gitPerformUpdate, then dockerInstallApp libreportal)
- `libreportal update check` -> forced recheck

gitFolderResetAndBackup's body is extracted into gitPerformUpdate (no exit)
so the WebUI path can reuse it; the interactive CLI flow is unchanged.

Detection JSON verified against the repo (up-to-date and behind cases).
webuiRunUpdate's re-clone + redeploy still needs validation on a live host.

The latest-version source is git for now and is the single swap point for
get.libreportal.org later — the JSON contract and frontend stay unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-21 23:33:43 +01:00
librelad
875a60f90f LibrePortal v0.1.0 — initial release
A free, open, self-hosted app platform (GNU AGPLv3): one-click app deploys,
Traefik reverse proxy with automatic SSL, rootless Docker support, gluetun
VPN routing, and a web dashboard to manage it all.

Free & open forever to self-host; optional paid hosted services fund it.
See PROMISE.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-21 20:37:54 +01:00