LibrePortal/scripts/peer/peer_check.sh
librelad 3fe2c0660a feat(peers): direct peer SSH — pairing + peer-shell + pull (Phase 3)
End-to-end direct-ssh-direct: two LibrePortal instances exchange pairing
tokens, each authorizes the other to call a locked-down peer-shell dispatcher
via SSH forced-command, then either side can pull live app data from the
other without needing a shared backup repo.

Push and Connect-via-relay are deferred — push is symmetric to pull (same
forced-command, opposite verb), and the relay variant waits for Connect to
actually exist (config_json + kind enum already future-proofed in Phase 2).

Key generation (peer_key.sh):
  One ed25519 keypair per install at ~<manager>/.ssh/libreportal-peer{,.pub}.
  Generated lazily on the first peer-related call. Used as our outbound
  SSH identity AND as the pubkey other instances authorize.
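
The lazy key generation described above can be sketched roughly as follows. This is an illustration only, not the contents of peer_key.sh: the function name ensure_peer_key, the key comment, and the directory permissions are assumptions.

```shell
#!/bin/bash
# Hypothetical sketch: create an ed25519 keypair only when it is missing.
# ensure_peer_key is an illustrative name, not a function from peer_key.sh.
ensure_peer_key() {
    local key_path="$1"
    local key_dir
    key_dir=$(dirname "$key_path")

    if [[ ! -f "$key_path" ]]; then
        # First peer-related call: create the directory and the keypair.
        mkdir -p "$key_dir" && chmod 700 "$key_dir" || return 1
        # -N "" gives an empty passphrase so BatchMode SSH can use the key.
        ssh-keygen -q -t ed25519 -N "" -C "libreportal-peer" \
            -f "$key_path" </dev/null || return 1
    fi

    # Print the public key path; later calls reuse the existing pair.
    printf '%s\n' "$key_path.pub"
}

# Example (commented out so the sketch never touches a real ~/.ssh):
# ensure_peer_key "$HOME/.ssh/libreportal-peer"
```

Calling the function a second time leaves the existing keypair untouched, which is what lets every peer-related entry point call it unconditionally.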

Forced-command dispatcher (peer_shell.sh):
  Standalone script, deployed by peerInstallShell() to
  ~<manager>/.local/bin/peer-shell. authorized_keys entries look like:
    command="~/.local/bin/peer-shell <peer-name>",no-pty,no-port-forwarding,
    no-X11-forwarding,no-agent-forwarding,no-user-rc ssh-ed25519 AAAA… peer:<name>
  sshd hands us $SSH_ORIGINAL_COMMAND; we parse, whitelist the verb, and
  refuse anything else. Verbs:
    ping        Liveness probe (JSON ok:true).
    list-apps   JSON {peer, apps:[{slug, size_kb}]}.
    stream-app  tar of containers_dir/<slug> to stdout (slug strictly
                validated — lowercase alnum+dash; rejects path traversal).
  Audit log appended to ~/.local/state/libreportal/peer-shell.log. Excluded
  from the generated source arrays (would crash any sourcing shell on empty
  SSH_ORIGINAL_COMMAND); generate_arrays.sh skip-list extended.
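
A minimal sketch of the whitelist idea, assuming the dispatcher splits the request on whitespace. The names dispatch_verb and valid_slug and the exact refusal messages are hypothetical; the real peer_shell.sh may parse and report differently.

```shell
#!/bin/bash
# Hypothetical sketch of a forced-command verb whitelist (not peer_shell.sh).

# Accept only lowercase alphanumerics and dashes, starting with an
# alphanumeric. "../x", "a/b", and the empty string all fail this test.
valid_slug() {
    [[ "$1" =~ ^[a-z0-9][a-z0-9-]*$ ]]
}

# Decide what to do with one request line (what sshd would pass in
# SSH_ORIGINAL_COMMAND). Prints the decision; returns 2 when refusing.
dispatch_verb() {
    local request="$1" verb arg extra
    read -r verb arg extra <<< "$request"

    case "$verb" in
        ping|list-apps)
            if [[ -n "$arg" ]]; then
                echo "refused: unexpected argument"; return 2
            fi
            echo "allow: $verb"
            ;;
        stream-app)
            if [[ -n "$extra" ]] || ! valid_slug "$arg"; then
                echo "refused: bad slug"; return 2
            fi
            echo "allow: stream-app $arg"
            ;;
        *)
            echo "refused: unknown verb"; return 2
            ;;
    esac
}

# Example calls; "|| true" keeps a refusal from aborting a calling script.
dispatch_verb "ping" || true
dispatch_verb "stream-app ../etc" || true
```

Because the slug is matched against a strict pattern before it is ever joined to a directory path, traversal attempts are refused without any path normalisation.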

Pairing token (peer_pairing.sh):
  Format: lp-peer|v1|<name>|<user>|<host>|<port>|<base64-pubkey>|<fingerprint>
  Pipe-delimited because the SHA256 fingerprint contains ':' and the base64
  fields can contain '+', '/' and '='; '|' appears in none of them.
  peerPairingParse decodes + re-derives the fingerprint from
  the actual key, refusing tokens with mismatched fingerprints (catches
  truncation / tampering). peerPairingAccept:
    1. Installs peer-shell (peerInstallShell).
    2. Appends to authorized_keys with the lockdown options above.
    3. Inserts a peers row (kind=direct-ssh-direct, config carries host,
       port, user, fingerprint).
  Symmetric — user runs accept on BOTH sides with the other's token to
  enable bidirectional calls.
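
The field splitting and version check can be sketched as below. This covers only the structural part of parsing; re-deriving the fingerprint from the decoded key is left out. The function name parse_pairing_token and the output format are assumptions, while the error strings not-a-token and unsupported-version are the ones named in the smoke tests.

```shell
#!/bin/bash
# Hypothetical sketch of pairing-token field validation (not peer_pairing.sh).
# Token layout assumed from the format line above:
#   lp-peer|v1|<name>|<user>|<host>|<port>|<base64-pubkey>|<fingerprint>
parse_pairing_token() {
    local token="$1"
    local magic version name user host port pubkey_b64 fingerprint extra

    # With a non-whitespace IFS, empty fields are preserved, and anything
    # beyond the eighth field lands in $extra.
    IFS='|' read -r magic version name user host port pubkey_b64 \
        fingerprint extra <<< "$token"

    if [[ "$magic" != "lp-peer" ]]; then
        echo "not-a-token"; return 1
    fi
    if [[ "$version" != "v1" ]]; then
        echo "unsupported-version"; return 1
    fi
    if [[ -n "$extra" || -z "$name" || -z "$user" || -z "$host" \
          || -z "$pubkey_b64" || -z "$fingerprint" ]] \
       || ! [[ "$port" =~ ^[0-9]+$ ]]; then
        echo "not-a-token"; return 1
    fi

    printf 'name=%s user=%s host=%s port=%s fingerprint=%s\n' \
        "$name" "$user" "$host" "$port" "$fingerprint"
}

# Example with placeholder values (the key and fingerprint are not real):
parse_pairing_token 'lp-peer|v1|alpha|manager|203.0.113.7|22|QUJD|SHA256:abc123' || true
```

A real implementation would additionally decode the pubkey field and compare its computed fingerprint against the last field before trusting any of the values.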

Outbound SSH (peer_remote.sh):
  peerExec <name> <verb> [args] — looks up the peer's connection config and
  ssh's in with the right key, BatchMode + ConnectTimeout + accept-new for
  the host key. peerPing wraps it and updates peers.status + last_seen.
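
As an illustration of the SSH options mentioned above, the sketch below only assembles the argument vector and does not connect anywhere. The function name build_peer_ssh_cmd, the PEER_SSH_CMD array, and the 10-second timeout are assumptions; peer_remote.sh may use different values.

```shell
#!/bin/bash
# Hypothetical sketch: assemble (but do not run) an outbound peer SSH call.
# Fills the global array PEER_SSH_CMD so the caller can inspect or exec it.
build_peer_ssh_cmd() {
    local key="$1" user="$2" host="$3" port="$4"
    shift 4

    PEER_SSH_CMD=(
        ssh
        -i "$key"
        -p "$port"
        -o BatchMode=yes                      # never prompt; fail instead
        -o ConnectTimeout=10                  # assumed value, in seconds
        -o StrictHostKeyChecking=accept-new   # trust on first use only
        "$user@$host"
        "$@"                                  # verb + args for peer-shell
    )
}

# Example with placeholder connection details:
build_peer_ssh_cmd "$HOME/.ssh/libreportal-peer" manager peer.example 22 list-apps
printf '%q ' "${PEER_SSH_CMD[@]}"; echo
```

Keeping the command in an array rather than a string avoids re-splitting when a key path or argument contains spaces; the caller would run it with "${PEER_SSH_CMD[@]}".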

Pull-an-app (peer_pull.sh):
  peerPullApp <peer> <app> [--no-pre-backup] [--keep-urls]
    1. peerPing (refuse if unreachable).
    2. migratePreBackupDestination (reuses the Phase 0 safety wrapper —
       same restic-tagged pre-migrate snapshot as the backup-channel flow).
    3. Stop + wipe destination's app folder.
    4. peerExec stream-app | tar -x (pipefail; bails on partial transfers).
    5. migrateApplyUrlRewrite + dockerComposeUpdateAndStartApp install
       (URL repointing, idempotent install path).
    6. dockerComposeUp + post-restore hooks.
  Identical Stage-2..6 to migrateApplyApp; only the data source differs
  (tar-over-SSH instead of restic-restore).
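
Step 4 relies on pipefail so that a failing producer is not masked by tar exiting cleanly. A minimal local sketch of that pattern, with a plain tar producer standing in for the SSH call (extract_stream is a hypothetical name, not a function from peer_pull.sh):

```shell
#!/bin/bash
# Hypothetical sketch of "producer | tar -x" with pipefail.
# Usage: extract_stream <dest-dir> <producer-command> [args...]
extract_stream() {
    local dest="$1"
    shift
    (
        # pipefail makes the pipeline fail if EITHER side fails, so a
        # dropped connection is reported even if tar itself exits 0.
        set -o pipefail
        mkdir -p "$dest" || exit 1
        "$@" | tar -x -f - -C "$dest"
    )
}

# Local demonstration: a tar producer stands in for the remote stream.
demo_src=$(mktemp -d)
demo_dst=$(mktemp -d)
echo "hello" > "$demo_src/data.txt"

if extract_stream "$demo_dst/app" tar -c -f - -C "$demo_src" .; then
    echo "transfer complete"
else
    echo "transfer failed" >&2
fi
rm -rf "$demo_src" "$demo_dst"
```

Running the pipeline in a subshell keeps the pipefail setting from leaking into the caller's shell options.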

CLI (cli_peer_commands.sh + header):
  libreportal peer token                — emit this host's pairing token
  libreportal peer pair <token> [name]  — accept a token (override name)
  libreportal peer apps <peer>          — live peer-shell list-apps
  libreportal peer pull <peer> <app> [--no-pre-backup] [--keep-urls]

WebUI (/peers):
  Header gains 'Show my token' and 'Pair with token' buttons (both open
  modals around the matching CLI verbs). Token modal warns the user that
  the token is credentials. Pair modal accepts a free-form override name.
  Direct-SSH peer cards gain a 'List apps' button that opens an inline
  drawer showing the peer's live app inventory (via peer apps) with per-
  app 'Pull' buttons. Pull modal has the same two safety toggles as the
  Migrate tab (pre-backup ON, URL rewrite ON by default).
  Backup-channel manual-add modal kept; direct-SSH must use the token flow.

Smoke-tested:
  - All 16 peer-subsystem functions register without crashing the shell.
  - peer-shell ping ⇒ {ok:true}; unknown-verb refused; path-traversal slug
    refused; valid-slug streams.
  - Token emit→parse round-trip preserves every field; garbage rejected
    with not-a-token; v99 rejected with unsupported-version.
Signed-off-by: librelad <librelad@digitalangels.vip>
2026-05-26 17:56:57 +01:00

95 lines
3.5 KiB
Bash

#!/bin/bash
# Reachability check for a peer. The meaning of "reachable" depends on kind:
#   backup-channel        At least one snapshot from this peer's hostname is
#                         visible in the configured location within the last
#                         30 days (or ever, if it's just been added).
#   direct-ssh-direct     SSH connect + 'peer-shell ping' (Phase 3).
#   direct-ssh-via-relay  Open relay session + 'peer-shell ping' (Phase 3b).
#
# Updates the peer's status + last_seen columns on success/failure so the UI
# can render a colored dot without re-running the check on every page load.
peerCheckReachable()
{
    local name="$1"
    if [[ -z "$name" ]]; then isError "peerCheckReachable: name required"; return 1; fi
    local row
    row=$(sqlite3 "$(_peerDb)" "SELECT id, kind, config_json FROM peers WHERE name='$(peerSqlEscape "$name")';" 2>/dev/null)
    if [[ -z "$row" ]]; then
        isError "No peer named '$name'"
        return 1
    fi
    local id kind cfg
    IFS='|' read -r id kind cfg <<< "$row"
    local new_status="unknown"
    local now
    now=$(date -Iseconds)
    case "$kind" in
        backup-channel)
            local hostname loc_idx
            hostname=$(printf '%s' "$cfg" | grep -o '"hostname":"[^"]*"' | head -1 | cut -d'"' -f4)
            loc_idx=$(printf '%s' "$cfg" | grep -o '"loc_idx":[0-9]*' | head -1 | cut -d':' -f2)
            if [[ -z "$hostname" ]]; then
                new_status="config-error"
            elif [[ -z "$loc_idx" ]]; then
                # No preferred location; try any enabled location.
                local found=""
                while IFS= read -r idx; do
                    [[ -z "$idx" ]] && continue
                    if engineSnapshotsJson "$idx" "" "$hostname" 2>/dev/null | grep -q '"short_id":'; then
                        found="$idx"; break
                    fi
                done < <(resticEnabledLocations)
                [[ -n "$found" ]] && new_status="ok" || new_status="no-snapshots"
            else
                if engineSnapshotsJson "$loc_idx" "" "$hostname" 2>/dev/null | grep -q '"short_id":'; then
                    new_status="ok"
                else
                    new_status="no-snapshots"
                fi
            fi
            ;;
        direct-ssh-direct)
            # peerPing already updates the row (status + last_seen) and
            # prints the status name on stdout. Echo that value and return
            # early so the UPDATE at the bottom of the function is skipped.
            new_status=$(peerPing "$name" 2>/dev/null)
            [[ -z "$new_status" ]] && new_status="unreachable"
            echo "$new_status"
            [[ "$new_status" == "ok" ]]
            return $?
            ;;
        direct-ssh-via-relay)
            new_status="needs-connect"
            ;;
        *)
            new_status="unknown-kind"
            ;;
    esac
    sqlite3 "$(_peerDb)" \
        "UPDATE peers SET status='$(peerSqlEscape "$new_status")', last_seen='$now' WHERE id=$id;" 2>/dev/null
    echo "$new_status"
    [[ "$new_status" == "ok" ]]
}
# Check every peer; useful for the WebUI's "Refresh" button.
peerCheckAll()
{
    local name
    while IFS= read -r name; do
        [[ -z "$name" ]] && continue
        local status
        status=$(peerCheckReachable "$name")
        # Separate the name from the status so the two do not run together.
        isNotice "  $name: $status"
    done < <(sqlite3 "$(_peerDb)" "SELECT name FROM peers ORDER BY name;" 2>/dev/null)
}