

# LibrePortal — Updates, Improvements & Distribution (Roadmap / Vision)
**Status:** §0–§7 are the brainstorm (vision). **§8 is the committed format spec** and the open forks (§6) are resolved there. · **Audience:** us, future-self · **Scope:** the updater feature, "hotfixes", and how third-party themes/apps/components get distributed · **Origin:** brainstorm 2026-05-30/31 → format decided & Phase 1 built 2026-05-31
> Sections 0–7 below are the original thinking doc, kept verbatim so the
> reasoning isn't lost. **The conclusion of that brainstorm is §8: the concrete
> artifact format**, designed so apps/themes/components slot into the same pipe a
> hotfix uses. Phase 1 of it (the signed-fetch+verify read primitive) is already
> built — see §8.7. The forks in §6 are no longer open; §8.5 records how each was
> resolved.
---
## 0. The one idea everything hangs off
The cohesion worry that started this: the updater feels like a **bolt-on**. The fix
isn't to hide it — it's to notice that hotfixes, app updates, themes, and components
are all the *same verb*:
> **LibrePortal pulls a signed, declarative thing from a source, verifies it, and
> applies it reversibly (snapshot → apply → rollback).**

Build that **one distribution primitive** once, and hotfixes / app-installs / themes /
components become *payloads* through one pipe, not separate features.
That single primitive is the spine of this whole doc.
It rides machinery that already exists:
- **Mutations via tasks** — every apply is a `libreportal …` task, never a new mutating API.
- **Scan-and-manifest** — a thing is "installed" by dropping a folder; the scan discovers it.
- **Recovery** — the updater already snapshots-before-update and can roll back. Everything inherits that safety net for free. *This is what makes bold defaults defensible.*
- **minisign** — release signing infra already exists; reuse it as the trust anchor.
- **The existing update-check pipe** — already pings out for "is there a new version"; extend that *one signed manifest*, don't add a second phone-home.
---
## 1. Hotfixes
**What it is:** a small, signed, individually-reversible, **declarative** change the
LibrePortal team ships *out-of-band* (between releases), each with a plain-English
what + why, each independently toggleable.
**The killer use case — upstream breakage.** Self-hosters get burned independently when
an upstream image changes something (Vaultwarden renames an env var, Jellyfin moves a
data dir, an app's `latest` tag breaks on a Tuesday). A hotfix channel turns the team's
collective firefighting into a shipped product: *we notice, push a one-line reversible
fix, it lands on every install within hours.* No single self-hoster can replicate that.
**Content flavors:**
- **Upstream-breakage fixes** (the killer one)
- **Security hardening** (tighten a default header, disable a risky default)
- **Compatibility shims** (ARM, rootless, specific kernels)
- **Quality-of-life tweaks** ("cool tweaks we found useful")
**The supply-chain contract (non-negotiable for this project):** an on-by-default,
auto-fetched, auto-applied feed *is* a remote-code channel into every box. So:
- **Signed** — minisign, our key.
- **Declarative, not arbitrary scripts** — "set config key K", "add compose label L",
"patch file F *only if its checksum matches*". Bounded + auditable, not `run this .sh`.
- **Public + identical for everyone** — same transparency model as the warrant canary.
A publicly-logged feed makes a *targeted* hotfix to one victim impossible to send silently.
- **Rides the existing update-check pipe** — no new phone-home, no new metadata leak.
- **Nothing silent** — every applied hotfix lands in **History** with what / why / revert.
**On "enabled by default" (UNDECIDED — see open forks):** leaning toward splitting by
severity — *security/breakage* auto-applies (rollback has your back); *tweaks/QoL* are
surfaced with one-click apply, or auto only if the user opted into "auto-improve."
**Why on-by-default is even defensible:** because Recovery already exists — every hotfix
is reversible through the same task → snapshot → apply path. The safety net unlocks the
bold default.
`TODO` (when prioritized):
- [x] Define the declarative hotfix schema (the allowed operations + checksum preconditions). → **§8.2**
- [x] Decide auto-apply policy (uniform vs severity-split). → **§8.5 fork 2** (severity-split)
- [x] Fetch + verify the signed manifest on the same channel as the version check. → **§8.7 Phase 1 (built)**
- [ ] Apply pipeline for the ops (snapshot → apply → verify → rollback → History). → §8.7 Phase 2
- [ ] Surface applied/available hotfixes as a stream in the updater + History audit trail. → §8.7 Phase 3
---
## 2. Reframe the updater → "Updates & Improvements"
The updater's identity is currently fuzzy ("a list of app versions" — which honestly
*could* just be a tab on the app page, which is why it reads as bolted-on). Hotfixes give
it a reason to be its own thing. Rename the concept from **"App Updater"** to
**"Updates & Improvements"** — the single front door for *everything that changes your
install from the outside*:
- **App updates** (version bumps)
- **Security** (CVEs — the urgent stuff)
- **Hotfixes** (curated small improvements — §1)
- **Recovery** (the safety net that makes all of it safe to apply)
- **History** (audit trail of everything applied)
That earns the standalone link and answers the earlier "should this fold into Admin / be
a tab on apps?" question: it stays its own section *because* it's now the curated-improvement
channel, not just a version list. (Existing tabs already are Overview / Updates / Security /
Recovery / History — this is mostly a framing + the hotfix stream, not a rebuild.)
`TODO`:
- [ ] Decide on the rename / framing in the UI.
- [ ] Add the Hotfixes stream as a tab or a section within Overview.
---
## 3. Distribution: a **registry**, not a **marketplace**
For getting third-party **apps / components / themes** onto a box: do **not** build an
upload platform (the Google-Play / Nextcloud-store / npm shape = hosting + accounts +
moderation + liability for code running near-root on people's boxes). That's the
worst-fitting shape for a privacy/no-managed-hosting/blind-relay project.
**Want Nextcloud's *UX* (in-app browse + one-click install) on F-Droid's *backend*** (a
signed, git-published index of recipes pointing at authors' own repos; contribution = a PR
to the index repo; you host a static signed JSON, not an upload server). Power users can
add a **custom source URL** (a "tap"), so the ecosystem is open without you being the host
or gatekeeper.
### 3.1 Why our apps aren't Nextcloud's apps (the key insight)
A **Nextcloud app** is a PHP plugin running *inside* the Nextcloud process — it can do
anything, which is why Nextcloud needs a code-signing CA + review. A **LibrePortal app**
is a *whole separate container we orchestrate* (upstream's image, from upstream's
registry). What a user "adds" is a **definition** (image, ports, config keys, routing) —
*wiring*, not in-process code. That's a much smaller, more declarative trust surface.
Lean into it.
### 3.2 The one real danger to design around
A LibrePortal app definition can ship host-side `tools/*.sh` hooks that run via the task
system. The compose/config is declarative + safe-ish; **the hook scripts are the
arbitrary-code part** (our equivalent of Nextcloud's in-process PHP). So tier trust around
*that*:
| Tier | Signed by | Host scripts | UI |
|---|---|---|---|
| **Official** | LibrePortal team key | allowed (reviewed) | green check |
| **Community** | author key | disallowed / sandboxed / **shown for review before install** | yellow "community — review the source", extra confirm |
| **Custom source** | author key / unsigned | advanced | "you're on your own" framing |
### 3.3 Install flow (all existing machinery)
Browse catalog → click **Add** → WebUI dispatches a task (`libreportal app add <signed-source>`)
→ fetch definition, verify signature/checksum, drop into `containers/<app>/`, run scan/regen,
app appears. Snapshot-before + reversible uninstall via Recovery. No new mutating API.
`TODO`:
- [ ] Build the signed-fetch + reversible-install primitive (§0) — hotfixes need it too.
- [ ] Surface first-party app definitions as a browsable "Browse & Add" catalog in the App Center.
- [ ] Define the trust tiers + how host scripts are gated for community sources.
- [ ] (later) The signed git index format + "add custom source" UX.
- [ ] (later) Theme gallery on the same index (lowest risk, but still signed — CSS can exfil via `background-image`).
---
## 4. Sequencing — don't build the storefront before there are goods
You have one theme set, a handful of first-party apps, and zero community contributions
today. A registry with nothing in it is pure overhead. So:
1. **First-party catalog UX now** — surface our own app definitions as browse-and-add.
Useful day one with no third parties; first-party apps *are* the seed catalog.
2. **The signed-fetch + reversible-install primitive** underneath (hotfixes need it anyway).
3. **Open to a community index** only once there's real demand. The index is a one-file
signed artifact you add the day the first good community app/theme exists — not a platform.
Same staging applies to hotfixes (first-party only, always) and themes.
---
## 5. Money / Connect note
A *paid* marketplace contradicts the decided Connect direction (blind relay, no managed
hosting; value = privacy relay + support stack). If money ever enters, "curated/supported
components *as part of Connect*" fits the model; "host a store and take a cut" does not.
Flag only; not on the table.

---
## 6. Open forks (RESOLVED — see §8.5)
> These were the genuinely-undecided questions. They are now decided; §8.5 holds
> the resolutions and the reasoning. Kept here for the record.
1. **Hotfix scope** — config/compose tweaks only, or can a hotfix patch app files / our own WebUI code too? (Sets the entire risk profile.)
2. **Auto-apply policy** — uniformly on-by-default, or split by severity (security auto, tweaks surface-and-suggest)?
3. **Hotfix locality** — per-app (also shows on the app's page) vs system-wide vs both?
4. **Third-party contribution — yet?** Or first-party-curated for the foreseeable future? If the latter, skip the index entirely and just build the signed-fetch primitive; "registry" is a v2 concern.
5. **App catalog entry point** — curated Browse-&-Add list, or bring-your-own-compose (add an arbitrary container) as the primary entry, or both?
---
## 7. Stuff we discussed but didn't capture here
*(Placeholder — there were more conclusions from the brainstorm that didn't make it in.
Add them as they resurface.)*
- [ ] _…_
---
# Part II — The format (committed spec)
## 8. The artifact format
This is the concrete shape the brainstorm landed on. It was stress-tested by a
four-lens design pass (marketplace-first, security-first, simplicity/reuse-first,
ops-ux-first) that **converged** on the same model — which is why it's promoted
from "vision" to "spec". The whole thing is **one verb over a type-tagged
envelope**; a hotfix is the first artifact type, and apps/themes/components are
*new envelope rows*, not new features.
### 8.0 Three layers (each already half-built)
| Layer | What it is | Reuses |
|---|---|---|
| **INDEX** | A static, team-signed JSON catalog at `$base/$channel/index.json` (+ `.minisig`), in the **same release tree as `latest.json`**. A list of artifact ENVELOPES. | `fetch.sh` downloaders, the footprint signing key, the existing update-check phone-home |
| **ENVELOPE** | One artifact entry. **Fixed** metadata for every type; the *only* type-specific part is `payload`, a tagged union keyed by `payload.kind`. | — (new, but tiny + frozen) |
| **PIPELINE** | The verb: fetch → verify(sha256+sig) → snapshot → apply → verify → auto-rollback → History. | `lpFetchRelease`/`lpVerifyMinisig`, `updaterApplyApp` (snapshot/rollback/History), the task system |
The envelope **never changes** as new types arrive. Only two fields carry the
type information: `type` and `payload.kind`. That is the whole marketplace seam.
### 8.1 The INDEX + ENVELOPE (example)
`get.libreportal.org/stable/index.json` (signed by `index.json.minisig`):
```jsonc
{
  "schema": 1,
  "index_serial": 17, // monotonic; anti-rollback (TUF-lite)
  "valid_until": 1750000000, // epoch; a stale feed is REFUSED (anti-withholding)
  "generated_at": "2026-05-31T12:00:00Z",
  "artifacts": [
    {
      "id": "hf-vaultwarden-signup-env-2026-05", // stable, unique
      "type": "hotfix", // hotfix | app | theme | component
      "version": 1, // bump to re-issue/supersede
      "publisher": { "name": "LibrePortal", "trust": "official" }, // official|community|custom
      "severity": "breakage", // security|breakage|compat|tweak
      "auto": true, // see §8.5 fork 2 (severity-split default)
      "title": "Fix Vaultwarden signup after upstream env rename",
      "why": "Upstream renamed SIGNUPS_ALLOWED; logins break until the new key is set.",
      "applies_when": { // gates; missing = always
        "app": "vaultwarden", "min_lp": "1.0.0", "max_lp": null,
        "max_footprint": 4
      },
      "payload": {
        "kind": "ops", // ops (hotfix) | bundle (app/theme/component)
        "url": "stable/payloads/hf-vaultwarden-signup-env-2026-05.json",
        "sha256": "…", "sig": "stable/payloads/hf-…json.minisig"
      }
    }
  ]
}
```
**Fixed fields, identical for every type:** `id, type, version, publisher{name,trust},
severity, auto, title, why, applies_when, payload{kind,url,sha256,sig}`. An app entry
is byte-for-byte this shape with `type:"app"`, `payload.kind:"bundle"`, and a tarball
payload. A theme is `type:"theme"`, `kind:"bundle"`. Nothing in the envelope moves.
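
A minimal sketch of what "fixed fields" and the `applies_when` gates could look like to a client. It is written in Python for readability only (the real accessors are bash in `scripts/source/artifacts.sh`). The names `missing_fields`, `gate_passes` and `parse_version` are hypothetical, and several points are assumptions: `applies_when` is treated as optional because the example comments it as "missing = always", the `app` gate is read as "this app is installed", `min_lp`/`max_lp` are taken as inclusive dotted-integer versions, and the footprint gate is left out.

```python
# Fixed envelope fields from the spec. applies_when is treated as optional
# here (assumption based on the "missing = always" comment in the example).
REQUIRED_FIELDS = ("id", "type", "version", "publisher", "severity",
                   "auto", "title", "why", "payload")
PAYLOAD_FIELDS = ("kind", "url", "sha256", "sig")
PUBLISHER_FIELDS = ("name", "trust")


def missing_fields(envelope):
    """Return the fixed fields absent from an envelope (empty list = well-formed)."""
    missing = [f for f in REQUIRED_FIELDS if f not in envelope]
    missing += ["payload." + f for f in PAYLOAD_FIELDS
                if f not in envelope.get("payload", {})]
    missing += ["publisher." + f for f in PUBLISHER_FIELDS
                if f not in envelope.get("publisher", {})]
    return missing


def parse_version(text):
    """Turn '1.0.0' into (1, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def gate_passes(envelope, installed_apps, lp_version):
    """Evaluate the app / min_lp / max_lp gates; an absent or null gate never blocks."""
    gates = envelope.get("applies_when") or {}
    app = gates.get("app")
    if app is not None and app not in installed_apps:
        return False
    current = parse_version(lp_version)
    if gates.get("min_lp") and current < parse_version(gates["min_lp"]):
        return False
    if gates.get("max_lp") and current > parse_version(gates["max_lp"]):
        return False
    return True
```

Because the checks only ever read the fixed fields, the same two functions would work unchanged for an `app` or `theme` envelope.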
**Forward-compat firewall:** an installed box that doesn't recognise a `type` or a
`payload.kind` **skips + logs** it (never errors). So the registry can publish new
types the day a newer client understands them, without breaking older installs.
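
The firewall can be sketched as a filter that runs over the verified index before anything else looks at it. This is an illustration in Python, not the project's bash code; `select_supported`, `KNOWN_TYPES` and `KNOWN_KINDS` are hypothetical names, and the sets reflect an assumed client that only handles hotfixes so far.

```python
import logging

log = logging.getLogger("artifacts")

# Assumption: what this particular client version knows how to handle.
KNOWN_TYPES = {"hotfix"}
KNOWN_KINDS = {"ops"}


def select_supported(artifacts):
    """Keep envelopes this client understands; skip and log the rest, never raise."""
    supported = []
    for envelope in artifacts:
        kind = envelope.get("payload", {}).get("kind")
        if envelope.get("type") not in KNOWN_TYPES or kind not in KNOWN_KINDS:
            log.info("skipping %s: unsupported type=%r kind=%r",
                     envelope.get("id"), envelope.get("type"), kind)
            continue
        supported.append(envelope)
    return supported
```

A newer client would only grow the two sets; the loop itself stays the same.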
### 8.2 The op vocabulary (`payload.kind:"ops"` — the hotfix body)
A **bounded, closed, declarative** set. **There is no `run-script` op, ever** — that
is the supply-chain contract from §1. The payload file is `{ "schema":1, "ops":[ … ] }`.
The applier is a hardcoded dispatch `case`; an **unknown op name aborts the whole
artifact** (fail-closed, never a partial apply). Every op:
1. is **precondition-guarded** (checksum / `expect_current`) — it refuses on local drift
rather than clobbering,
2. is **reversible** — reverse is the snapshot restore the pipeline already takes, so
even a buggy op can't make rollback wrong,
3. writes **only through the existing privilege funnels**: `runInstallOp`/`runFileOp`
by tree (never raw `sudo`); `set-config-key` rides `updateConfigOption`, which already
routes the write correctly per the de-sudo split.
| op | args | apply (existing fn) | reverse | precondition |
|---|---|---|---|---|
| `set-config-key` | `key,value` | `updateConfigOption KEY VALUE` | restore snapshot | `key` matches `^CFG_[A-Z0-9_]+$`; opt. `expect_current` |
| `add-compose-label` / `remove-compose-label` | `app,service,label` | edit `containers/<app>/docker-compose.yml` via `runFileOp` | inverse op / snapshot | service exists |
| `set-compose-image` | `app,service,image` | rewrite the `image:` line | restore prior image | current image == `expect_current` |
| `ensure-env` | `app,service,key,value` | upsert env entry | restore / remove | — |
| `patch-file-if-checksum-matches` | `path,expect_sha256,content_ref` | write new content **iff** current sha256 matches | restore snapshot | **hard** sha256 match; path-allowlisted to `containers/<app>/` + `configs/` |
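
One way to get "an unknown op aborts the whole artifact" is to validate every op before the first one is applied. The sketch below assumes each entry in `ops` is an object whose `op` field names the operation and whose remaining keys are the args from the table; that per-op shape is not spelled out above, so treat it as an assumption. `validate_ops` and `ArtifactRejected` are hypothetical names, and the real applier is described as a bash `case`, not Python.

```python
import re

# Pattern taken from the set-config-key precondition in the table.
CONFIG_KEY_RE = re.compile(r"^CFG_[A-Z0-9_]+$")

# The closed vocabulary; there is deliberately no run-script entry.
REQUIRED_ARGS = {
    "set-config-key": ("key", "value"),
    "add-compose-label": ("app", "service", "label"),
    "remove-compose-label": ("app", "service", "label"),
    "set-compose-image": ("app", "service", "image"),
    "ensure-env": ("app", "service", "key", "value"),
    "patch-file-if-checksum-matches": ("path", "expect_sha256", "content_ref"),
}


class ArtifactRejected(Exception):
    """The payload is refused as a whole; nothing has been applied."""


def validate_ops(payload):
    """Check every op up front and return how many there are; raise on any problem."""
    ops = payload.get("ops")
    if payload.get("schema") != 1 or not isinstance(ops, list):
        raise ArtifactRejected("unsupported payload schema or missing ops list")
    for index, op in enumerate(ops):
        name = op.get("op")
        if name not in REQUIRED_ARGS:
            raise ArtifactRejected(f"op {index}: unknown op {name!r}")
        missing = [arg for arg in REQUIRED_ARGS[name] if arg not in op]
        if missing:
            raise ArtifactRejected(f"op {index}: missing args {missing}")
        if name == "set-config-key" and not CONFIG_KEY_RE.fullmatch(op["key"]):
            raise ArtifactRejected(f"op {index}: key {op['key']!r} not allowed")
    return len(ops)
```

Validating first means a payload with five good ops and one unknown op leaves the box untouched, instead of relying on rollback to undo a partial apply.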
`set-compose-image` + `patch-file-if-checksum-matches` are the upstream-breakage killers
(§1). The checksum lock turns "patch a file" from an arbitrary write into a drift-safe,
conflict-detecting, reversible transform.
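
A rough sketch of the checksum lock, assuming the new content has already been fetched and verified (how `content_ref` resolves is not covered here). It writes directly to disk for illustration, whereas the spec routes writes through `runFileOp`. `patch_file_if_checksum_matches`, `PreconditionFailed` and the `base_dir` parameter are hypothetical names.

```python
import hashlib
from pathlib import Path

ALLOWED_ROOTS = ("containers", "configs")  # relative to base_dir


class PreconditionFailed(Exception):
    """Raised when the op refuses to run; nothing has been written."""


def patch_file_if_checksum_matches(base_dir, rel_path, expect_sha256, new_content):
    """Replace a file only if it is allowlisted and still has the expected sha256."""
    base = Path(base_dir).resolve()
    target = (base / rel_path).resolve()
    try:
        parts = target.relative_to(base).parts
    except ValueError:
        raise PreconditionFailed("path escapes the base directory") from None
    # containers/<app>/<file> needs an app directory; configs/<file> does not.
    min_depth = 3 if parts[:1] == ("containers",) else 2
    if not parts or parts[0] not in ALLOWED_ROOTS or len(parts) < min_depth:
        raise PreconditionFailed("path is outside the allowlist")
    if not target.is_file():
        raise PreconditionFailed("target file does not exist")
    current = hashlib.sha256(target.read_bytes()).hexdigest()
    if current != expect_sha256:
        raise PreconditionFailed("local drift: checksum mismatch")
    target.write_bytes(new_content)
    return hashlib.sha256(new_content).hexdigest()
```

Running the same op twice fails the second time, because the file no longer has the expected hash; that is the conflict detection the checksum buys.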
### 8.3 The PIPELINE (the verb) — `libreportal artifact apply <id>`
A generalization of `updaterApplyApp`, run **only as a task** (`cliTaskRun "libreportal
artifact apply <id>" "artifact_apply" "<app|->" ""`; the processor re-invokes with
`LIBREPORTAL_TASK_EXEC=1`). Step 0 is a read-only resolve; of the seven steps that follow, **six are type-agnostic and only step 4 dispatches
on `payload.kind`**:
0. **RESOLVE** (read-only): `lpFetchIndex` (cached), find the envelope by id, check
`applies_when` + `lpVersionGt` + `max_footprint <= lpInstalledFootprintVersion`
(reuse `fetch.sh`'s exact footprint guard). Gate fails → History `skipped` + reason.
1. **FETCH**: `_lpDownload "$base/$channel/$payload.url"`.
2. **VERIFY**: `_lpSha256` == `payload.sha256`, then `lpVerifyMinisig` against the
per-artifact `payload.sig`. (Two-tier: footprint key signs the index; the index
pins each payload's hash + sig.)
3. **SNAPSHOT**: `libreportal backup app <app>` (the Backup engine), the reversibility
anchor that makes auto-apply defensible.
4. **APPLY**: `kind:"ops"` → the §8.2 interpreter; `kind:"bundle"` → drop+scan/regen
(Phase 4). **Only this step knows the type.**
5. **VERIFY**: app healthy / container up (reuse the updater's post-check).
6. **AUTO-ROLLBACK on failure**: `updaterRollbackApp <app> auto` (restore the snapshot).
7. **HISTORY**: `updaterRecordHistory` (extended with `artifact_id`, `serial`) → the
existing History tab. **Nothing silent.**
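
The control flow of steps 2 to 7 can be sketched with every side effect injected as a callable, which makes the "only step 4 knows the type" property visible. This is an illustration in Python, not the bash implementation: `apply_artifact` and its parameters are hypothetical stand-ins for `lpVerifyMinisig`, the Backup engine, `updaterRollbackApp` and `updaterRecordHistory`. Only the `skipped` History status appears in the list above; the other status strings are assumptions.

```python
import hashlib


def apply_artifact(envelope, payload_bytes, *, verify_sig, snapshot, handlers,
                   healthy, rollback, record):
    """Run verify -> snapshot -> apply -> health check -> rollback -> history."""
    art_id = envelope["id"]
    payload = envelope["payload"]

    # 2. VERIFY: the index-pinned hash first, then the per-artifact signature.
    if hashlib.sha256(payload_bytes).hexdigest() != payload["sha256"]:
        record(art_id, "refused", "sha256 mismatch")
        return "refused"
    if not verify_sig(payload_bytes, payload["sig"]):
        record(art_id, "refused", "bad signature")
        return "refused"

    # Look up the step-4 handler before mutating anything (skip + log if unknown).
    handler = handlers.get(payload["kind"])
    if handler is None:
        record(art_id, "skipped", "unknown payload.kind")
        return "skipped"

    snap = snapshot(envelope)             # 3. SNAPSHOT before any mutation
    try:
        handler(envelope, payload_bytes)  # 4. APPLY (the only type-aware call)
        ok = healthy(envelope)            # 5. VERIFY
    except Exception:
        ok = False
    if not ok:
        rollback(snap)                    # 6. AUTO-ROLLBACK
        record(art_id, "rolled_back", "apply or health check failed")
        return "rolled_back"
    record(art_id, "applied", envelope["why"])  # 7. HISTORY
    return "applied"
```

Adding a `bundle` handler later would mean one more entry in `handlers`; the rest of the function would not change.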
### 8.4 The marketplace seam
**Unchanged forever** (built once, reused): the index file + location + bash-native
parser; the whole envelope shape; pipeline steps 0–3, 5–7; the two-tier trust chain;
mutations-via-tasks; the `valid_until`/`index_serial` guarantees. Adding apps/themes/
components is **purely additive**:
- a new `type` value becomes "handled" in step 4's dispatch (old boxes skip+log — §8.1 firewall);
- those types use `payload.kind:"bundle"` (a signed tarball) + one new bundle handler;
- a **custom source ("tap")** is just a second `(base_url, pubkey)` pair appended to a
list — zero envelope change, the registry opens without us hosting or gatekeeping.
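
To make the "tap is just another pair" point concrete, here is a small sketch of a source list. The `Source` type, `index_urls` and `add_tap` are hypothetical names; the `https://` scheme, the tap URL and the tap key path are placeholders, while the official host and key path are the ones named in §8.1 and §8.6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    base_url: str      # where $channel/index.json lives
    pubkey_path: str   # the minisign public key that must sign that index


# Assumption: https scheme; host and key path as named elsewhere in this doc.
OFFICIAL = Source("https://get.libreportal.org",
                  "/usr/local/lib/libreportal/libreportal.pub")


def index_urls(source, channel):
    """Return (index URL, signature URL) for one source and channel."""
    index = f"{source.base_url.rstrip('/')}/{channel}/index.json"
    return index, index + ".minisig"


def add_tap(sources, base_url, pubkey_path):
    """Return a new list with a custom source appended; existing entries are untouched."""
    return list(sources) + [Source(base_url, pubkey_path)]
```

Each source is verified against its own key, so a tap can never vouch for an entry in the official index or the other way round.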
This is exactly the §3 "registry, not marketplace" shape, now expressed in the format.
### 8.5 Fork resolutions (was §6)
1. **Hotfix scope** → **config/compose ops + checksum-pinned file patches; NO code
execution.** `patch-file-if-checksum-matches` is allowlisted to `containers/<app>/` +
`configs/` and is drift-safe + reversible. **Our own install tree (WebUI/CLI code) is
off-limits to hotfixes** — it already has a signed, whole-tree-verified delivery channel
(releases + `SHA256SUMS` + `verify.sh`); letting a hotfix mutate it would open a second,
finer-grained code-injection surface that bypasses the whole-tree signature. Code fixes
ride an edge/out-of-band release. The killer use case (upstream breakage) is 100%
config/compose, so this loses nothing real.
2. **Auto-apply policy** → **severity-split, declarative in the envelope** (`severity` +
`auto`). `security`/`breakage` → auto-apply ON by default (defensible because the
snapshot/auto-rollback safety net exists); `compat`/`tweak` → surface + one-click, auto
only under an opt-in "auto-improve".
3. **Hotfix locality** → **both.** `applies_when.app` makes an artifact app-scoped (it also
surfaces on that app's page); a null app is system-wide. One field, both behaviours.
4. **Third-party, yet?** → **first-party only now.** The index ships with `trust:"official"`
entries; `community`/`custom` tiers just start appearing later (and gate the riskiest ops).
The "tap" mechanism is designed-in but unbuilt until there's real demand (§4 sequencing).
5. **App catalog entry point** → **curated Browse-&-Add** (first-party definitions as the
seed catalog), with bring-your-own-compose remaining the advanced/“custom source” path.
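
Resolutions 2 and 3 reduce to two small decisions per envelope, sketched below. How `severity` and `auto` combine is not pinned down above, so this is one possible reading and should be treated as an assumption: for `security`/`breakage` the envelope's `auto` flag decides, and for everything else only the user's "auto-improve" opt-in does. `should_auto_apply`, `scope` and `auto_improve_opt_in` are hypothetical names.

```python
AUTO_BY_DEFAULT = {"security", "breakage"}


def should_auto_apply(envelope, auto_improve_opt_in=False):
    """Decide whether an artifact applies without a click (one reading of fork 2)."""
    if envelope.get("severity") in AUTO_BY_DEFAULT:
        # Assumption: the publisher can still mark an urgent fix as non-automatic.
        return bool(envelope.get("auto", False))
    # compat / tweak (and anything unrecognised): surface only, unless opted in.
    return auto_improve_opt_in


def scope(envelope):
    """Return the app an artifact is scoped to, or 'system' when no app is set."""
    app = (envelope.get("applies_when") or {}).get("app")
    return app if app else "system"
```

Artifacts for which `should_auto_apply` is false would still be listed, with a one-click apply.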
### 8.6 Trust & transparency (the non-negotiables, in the format)
- **Two-tier signatures** anchored on the **root-owned footprint key**
(`/usr/local/lib/libreportal/libreportal.pub`): the manager can't swap it, so it can't bless a forgery.
- **`valid_until`** — a signed feed that simply *stops advancing* is the silent-withholding
/ targeting attack; a stale index is **refused**, not treated as "no updates". Same spirit
as the [warrant canary](../../) (freshness = signal).
- **`index_serial`** — monotonic; a lower serial than we've accepted is a rollback attack →
refused. The high-water mark is recorded locally and never lowered by a refused fetch.
- **Public + identical for everyone** — one signed feed; a targeted hotfix to a single
victim is impossible to send without it being publicly visible.
- **Nothing silent** — every apply lands in **History** with what / why / revert.
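
The `valid_until` and `index_serial` rules can be sketched as one check that runs after the signature has been verified. `check_index` and `IndexRefused` are hypothetical names, and treating an index as still valid at exactly `valid_until` is an assumption.

```python
import time


class IndexRefused(Exception):
    """The index is rejected; the stored high-water mark must not change."""


def check_index(index, high_water, now=None):
    """Return the new high-water serial for an accepted index, or raise IndexRefused."""
    now = time.time() if now is None else now
    if index["valid_until"] < now:
        # A feed that stopped advancing is refused, not read as "no updates".
        raise IndexRefused("stale index: valid_until has passed")
    serial = index["index_serial"]
    if serial < high_water:
        raise IndexRefused(f"rollback: serial {serial} < accepted {high_water}")
    return max(serial, high_water)
```

The caller would persist the returned value only on success, which is what keeps a refused fetch from ever lowering the mark.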
### 8.7 Build phases & status
- ✅ **Phase 1 — the signed-fetch + verify read primitive (BUILT 2026-05-31).**
- `lpVerifyMinisig` factored out of `lpFetchRelease` (`scripts/source/fetch.sh`) — the
single trust anchor now shared by releases *and* the index; `lpFetchRelease` refactored
to use it (no behaviour change).
- `scripts/source/artifacts.sh`: `lpFetchIndex` (download → **verify-before-parse** →
`valid_until` freshness → `index_serial` anti-rollback high-water → emit verified JSON),
plus parsing accessors (jq when present, grep fallback; the trust core is jq-free).
- `libreportal artifact index` (`scripts/cli/commands/artifact/`) — read-only front door
that fetches + verifies + lists. Runs directly (no mutation), like `updater check`.
- Self-tested: trust core fails closed (real key + no minisign → refuse), happy path,
stale-refused, rollback-refused, signature-refused, jq + grep parsing — 12/12.
- ⬜ **Phase 2 — the ops applier + apply verb.** `artifactApply`/`artifactApplyOps` with
the §8.2 vocabulary, per-payload sig check, snapshot → apply → verify → auto-rollback →
`updaterRecordHistory` (extend `history.json` with `artifact_id`/`serial`), wired as the
`artifact_apply` task. Makes the Vaultwarden killer use case real, first-party. *(next)*
- ⬜ **Phase 3 — WebUI surfacing.** A `webui_artifact_scan.sh` generator (clone of the
updater scan) writes `data/updater/generated/artifacts_available.json`; a "Hotfixes"
section in the Updates page reads it (graceful-absent). Hook the index fetch into the
existing update-check call site — **no second phone-home**.
- ⬜ **Phase 4 — marketplace types.** `payload.kind:"bundle"` handler (drop + scan/regen)
+ `type:"app"|"theme"|"component"` in step 4; later, the "tap" (custom source) UX.
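
As a companion to the Phase 1 description, here is a hedged sketch of "verify-before-parse" and "fails closed without minisign". It is Python rather than the project's bash, and `verify_minisig`, `load_verified_index` and `VerificationFailed` are hypothetical names (the real function is `lpVerifyMinisig`). The `minisign` flags used are the standard verify options (`-V`, `-p`, `-m`, `-x`, `-q`); the `which` and `run` parameters exist only so the behaviour can be exercised without the binary.

```python
import json
import shutil
import subprocess


class VerificationFailed(Exception):
    """Raised instead of returning unverified data."""


def verify_minisig(path, sig_path, pubkey_path,
                   which=shutil.which, run=subprocess.run):
    """Verify path against sig_path with minisign; refuse if the tool is missing."""
    if which("minisign") is None:
        raise VerificationFailed("minisign not installed; refusing to trust the file")
    result = run(["minisign", "-V", "-q", "-p", pubkey_path,
                  "-m", path, "-x", sig_path], capture_output=True)
    if result.returncode != 0:
        raise VerificationFailed("signature check failed")


def load_verified_index(path, sig_path, pubkey_path, **kwargs):
    """Parse the index only after its signature verifies."""
    verify_minisig(path, sig_path, pubkey_path, **kwargs)
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Parsing after verification means a malformed or hostile file never reaches the JSON parser unless it was signed by the trusted key.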