From eacbec81504ab469c28cdf3b4903486b8888ff1f Mon Sep 17 00:00:00 2001 From: Demian Date: Wed, 6 May 2026 22:17:57 -0400 Subject: [PATCH 01/19] chore: harden VPS deploy user with narrowed sudoers Replace docker group membership (root-equivalent socket access) with a sudoers file that permits only 'docker compose' and 'docker exec'. - bootstrap-vps.sh: replace usermod -aG docker with /etc/sudoers.d/deploy-docker - All deploy/backup/restore scripts: prefix docker compose calls with sudo - infra/docs/vps-setup.md: document Option B (implemented) and Option A (rootless Docker upgrade path) with pre-check, install, and verification steps --- infra/docs/vps-setup.md | 84 +++++++++++++++++++++++++++++++++ infra/scripts/backup-db.sh | 2 +- infra/scripts/bootstrap-vps.sh | 20 +++++++- infra/scripts/deploy-staging.sh | 6 +-- infra/scripts/deploy.sh | 6 +-- infra/scripts/restore-db.sh | 6 +-- infra/scripts/staging-down.sh | 2 +- infra/scripts/staging-up.sh | 4 +- 8 files changed, 115 insertions(+), 15 deletions(-) create mode 100644 infra/docs/vps-setup.md diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md new file mode 100644 index 0000000..6c31879 --- /dev/null +++ b/infra/docs/vps-setup.md @@ -0,0 +1,84 @@ +# VPS Setup — Deploy User Hardening + +## Overview + +The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker commands — nothing else. This document covers how the deploy user's Docker access is constrained. + +The deploy scripts use `sudo docker compose` / `sudo docker exec`. The sudoers file at `/etc/sudoers.d/deploy-docker` permits only those two subcommands and nothing else. The deploy user is **not** in the `docker` group — group membership grants unrestricted access to the Docker socket, which is root-equivalent. 
+ +## Current approach: narrowed sudoers (Option B) + +`bootstrap-vps.sh` writes `/etc/sudoers.d/deploy-docker`: + +``` +deploy ALL=(ALL) NOPASSWD: /usr/bin/docker compose * +deploy ALL=(ALL) NOPASSWD: /usr/bin/docker exec * +``` + +**What this allows:** + +- `sudo docker compose ...` — all compose operations (pull, up, down, ps, exec, stop, start) +- `sudo docker exec ...` — direct exec into containers (used by backup/restore scripts) + +**What this blocks:** + +- `docker run`, `docker rm`, `docker network rm`, `docker system prune`, and everything else +- Any direct access to the Docker socket (`/var/run/docker.sock`) + +### Verification + +```bash +# As the deploy user — should succeed: +sudo docker compose -f /opt/station/docker-compose.prod.yml ps + +# As the deploy user — should be denied: +sudo apt install anything +sudo rm -rf /opt +docker ps # no docker group, no socket access +``` + +--- + +## Preferred upgrade: rootless Docker (Option A) + +If the VPS kernel supports user namespaces, rootless Docker is the cleaner solution — the Docker daemon itself runs as the deploy user, so no root socket exists at all. + +### Pre-check + +SSH in as the deploy user and run: + +```bash +curl -fsSL https://get.docker.com/rootless | sh --dry-run +``` + +If the output is clean, proceed. If it reports missing `newuidmap` or kernel namespace support, stay on Option B. + +### Installation (as the deploy user) + +```bash +curl -fsSL https://get.docker.com/rootless | sh + +echo 'export PATH=/home/deploy/bin:$PATH' >> ~/.bashrc +echo 'export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock' >> ~/.bashrc +source ~/.bashrc + +loginctl enable-linger deploy +systemctl --user enable docker +systemctl --user start docker + +docker run hello-world +``` + +### After switching to rootless + +1. Remove the sudoers file: `sudo rm /etc/sudoers.d/deploy-docker` +2. 
Strip the `sudo` prefix from all deploy scripts (`deploy.sh`, `deploy-staging.sh`, `backup-db.sh`, `restore-db.sh`, `staging-up.sh`, `staging-down.sh`) +3. Update `bootstrap-vps.sh` to replace the sudoers block with the rootless install steps +4. Re-run `loginctl enable-linger deploy` and `systemctl --user enable docker` to survive reboots + +### Verification + +```bash +systemctl --user status docker # should show active (running) +docker run hello-world # should succeed without sudo +``` diff --git a/infra/scripts/backup-db.sh b/infra/scripts/backup-db.sh index 7d64442..92c2135 100755 --- a/infra/scripts/backup-db.sh +++ b/infra/scripts/backup-db.sh @@ -36,7 +36,7 @@ trap 'rm -f "${BACKUP_FILE}"' EXIT echo "${LOG_PREFIX} Starting backup at ${TIMESTAMP} (${LABEL})" -docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ +sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ pg_dump -U "${DATABASE_USER}" -d "${DATABASE_NAME}" \ | gzip > "${BACKUP_FILE}" diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index 9e2f8d2..f83948d 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -47,7 +47,23 @@ if ! id -u "${DEPLOY_USER}" >/dev/null 2>&1; then useradd -m -s /bin/bash "${DEPLOY_USER}" fi -usermod -aG docker "${DEPLOY_USER}" +# Grant the deploy user narrowed sudo access to docker compose only. +# This is intentionally NOT docker group membership — the docker group is +# root-equivalent because it allows unrestricted access to the Docker socket. +# The sudoers entry below limits the deploy user to docker compose operations +# and docker exec (for pg_dump/psql backups), preventing privilege escalation. +# +# Preferred alternative: rootless Docker (see infra/docs/vps-setup.md). 
+# If the VPS kernel supports user namespaces, switch to rootless Docker and +# remove this sudoers entry — the deploy scripts already use `sudo docker` +# which becomes a no-op when the user owns their own Docker daemon. +SUDOERS_FILE="/etc/sudoers.d/deploy-docker" +cat > "${SUDOERS_FILE}" << 'SUDOEOF' +deploy ALL=(ALL) NOPASSWD: /usr/bin/docker compose * +deploy ALL=(ALL) NOPASSWD: /usr/bin/docker exec * +SUDOEOF +chmod 440 "${SUDOERS_FILE}" +visudo -c -f "${SUDOERS_FILE}" install -d -m 700 -o "${DEPLOY_USER}" -g "${DEPLOY_USER}" "${DEPLOY_HOME}/.ssh" touch "${DEPLOY_HOME}/.ssh/authorized_keys" @@ -83,5 +99,5 @@ echo "Bootstrap complete." echo "- Install Nginx configs from infra/nginx/ into /etc/nginx/sites-available/" echo "- Enable the sites and reload Nginx." echo "- Run infra/scripts/issue-certs.sh once DNS is live." -echo "- Confirm the deploy user can SSH and run Docker commands without sudo." +echo "- Confirm the deploy user can SSH and run: sudo docker compose ps" echo "- Configure B2 secrets and verify /opt/station/rclone.conf is written during deploy." 
diff --git a/infra/scripts/deploy-staging.sh b/infra/scripts/deploy-staging.sh index 5d06815..6a58802 100755 --- a/infra/scripts/deploy-staging.sh +++ b/infra/scripts/deploy-staging.sh @@ -2,6 +2,6 @@ set -euo pipefail cd /opt/station -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml pull -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d --no-deps backend frontend -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps +sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml pull +sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d --no-deps backend frontend +sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps diff --git a/infra/scripts/deploy.sh b/infra/scripts/deploy.sh index 79375c4..2adb04e 100755 --- a/infra/scripts/deploy.sh +++ b/infra/scripts/deploy.sh @@ -2,6 +2,6 @@ set -euo pipefail cd /opt/station -docker compose --env-file .env.production -f docker-compose.prod.yml pull -docker compose --env-file .env.production -f docker-compose.prod.yml up -d --no-deps backend frontend -docker compose --env-file .env.production -f docker-compose.prod.yml ps +sudo docker compose --env-file .env.production -f docker-compose.prod.yml pull +sudo docker compose --env-file .env.production -f docker-compose.prod.yml up -d --no-deps backend frontend +sudo docker compose --env-file .env.production -f docker-compose.prod.yml ps diff --git a/infra/scripts/restore-db.sh b/infra/scripts/restore-db.sh index 0cabb7c..849939a 100755 --- a/infra/scripts/restore-db.sh +++ b/infra/scripts/restore-db.sh @@ -46,10 +46,10 @@ echo "${LOG_PREFIX} WARNING: if you need a clean replacement, drop and recreate echo "${LOG_PREFIX} Starting in 5 seconds. Press Ctrl+C to abort." 
sleep 5 -docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" stop backend -gunzip -c "${LOCAL_FILE}" | docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ +sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" stop backend +gunzip -c "${LOCAL_FILE}" | sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ psql -U "${DATABASE_USER}" -d "${DATABASE_NAME}" -docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend +sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend rm -f "${LOCAL_FILE}" echo "${LOG_PREFIX} Restore complete" diff --git a/infra/scripts/staging-down.sh b/infra/scripts/staging-down.sh index 7a97c16..6ee68a6 100755 --- a/infra/scripts/staging-down.sh +++ b/infra/scripts/staging-down.sh @@ -2,4 +2,4 @@ set -euo pipefail cd /opt/station -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml down +sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml down diff --git a/infra/scripts/staging-up.sh b/infra/scripts/staging-up.sh index e4991a8..ce21ede 100755 --- a/infra/scripts/staging-up.sh +++ b/infra/scripts/staging-up.sh @@ -2,5 +2,5 @@ set -euo pipefail cd /opt/station -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps +sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d +sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps From e9d2e6459ac0b8aac78907ed607d23703ea5dc49 Mon Sep 17 00:00:00 2001 From: Demian Date: Wed, 6 May 2026 22:28:19 -0400 Subject: [PATCH 02/19] chore: switch deploy user hardening to rootless Docker MIME-Version: 1.0 Content-Type: text/plain; 
charset=UTF-8 Content-Transfer-Encoding: 8bit VPS pre-check confirmed user namespace support (unprivileged_userns_clone=1, unshare works). uidmap package was absent but is installable via apt, so rootless Docker is viable. - bootstrap-vps.sh: add uidmap + dbus-user-session to apt installs, replace sudoers block with rootless Docker setup (loginctl enable-linger, .bashrc DOCKER_HOST/PATH, rootless install via runuser, systemctl --user enable/start) - Deploy scripts: revert sudo prefix — rootless Docker needs no sudo - infra/docs/vps-setup.md: document Option A as implemented, record pre-check results, verification steps, and security properties table --- infra/docs/vps-setup.md | 97 ++++++++++++--------------------- infra/scripts/backup-db.sh | 2 +- infra/scripts/bootstrap-vps.sh | 47 +++++++++------- infra/scripts/deploy-staging.sh | 6 +- infra/scripts/deploy.sh | 6 +- infra/scripts/restore-db.sh | 6 +- infra/scripts/staging-down.sh | 2 +- infra/scripts/staging-up.sh | 4 +- 8 files changed, 76 insertions(+), 94 deletions(-) diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index 6c31879..36d63e1 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -2,83 +2,56 @@ ## Overview -The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker commands — nothing else. This document covers how the deploy user's Docker access is constrained. +The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker operations — nothing more. This is achieved with rootless Docker: the deploy user runs their own Docker daemon entirely within their user namespace, with no root socket and no docker group membership. A compromised key cannot escalate to root or affect any other service on the host. -The deploy scripts use `sudo docker compose` / `sudo docker exec`. 
The sudoers file at `/etc/sudoers.d/deploy-docker` permits only those two subcommands and nothing else. The deploy user is **not** in the `docker` group — group membership grants unrestricted access to the Docker socket, which is root-equivalent. +## Approach: rootless Docker -## Current approach: narrowed sudoers (Option B) +The deploy user's Docker daemon runs unprivileged inside a user namespace. There is no `/var/run/docker.sock` accessible to the deploy user — the socket lives at `/run/user/<uid>/docker.sock` and is owned entirely by that user. -`bootstrap-vps.sh` writes `/etc/sudoers.d/deploy-docker`: +`bootstrap-vps.sh` handles the full setup: -``` -deploy ALL=(ALL) NOPASSWD: /usr/bin/docker compose * -deploy ALL=(ALL) NOPASSWD: /usr/bin/docker exec * -``` +- Installs `uidmap` and `dbus-user-session` prerequisites +- Enables linger so the deploy user's systemd session persists without an active login +- Sets `DOCKER_HOST` and `PATH` in `~deploy/.bashrc` +- Installs rootless Docker via `curl -fsSL https://get.docker.com/rootless | sh` (run as the deploy user) +- Enables and starts the `docker` systemd user service -**What this allows:** +The deploy scripts (`deploy.sh`, `backup-db.sh`, etc.) call `docker compose` directly — no `sudo` required. 
-- `sudo docker compose ...` — all compose operations (pull, up, down, ps, exec, stop, start) -- `sudo docker exec ...` — direct exec into containers (used by backup/restore scripts) +## Pre-check results (recorded 2026-05-07) -**What this blocks:** +| Check | Result | +| --------------------------------------------- | --------------------------------------- | +| `/proc/sys/kernel/unprivileged_userns_clone` | `1` ✓ | +| `newuidmap` installed | No — installed via `uidmap` apt package | +| `unshare --user sh -c "echo namespaces work"` | `namespaces work` ✓ | -- `docker run`, `docker rm`, `docker network rm`, `docker system prune`, and everything else -- Any direct access to the Docker socket (`/var/run/docker.sock`) +## Verification -### Verification +After running `bootstrap-vps.sh`, SSH in as the deploy user and confirm: ```bash -# As the deploy user — should succeed: -sudo docker compose -f /opt/station/docker-compose.prod.yml ps - -# As the deploy user — should be denied: -sudo apt install anything -sudo rm -rf /opt -docker ps # no docker group, no socket access -``` - ---- - -## Preferred upgrade: rootless Docker (Option A) - -If the VPS kernel supports user namespaces, rootless Docker is the cleaner solution — the Docker daemon itself runs as the deploy user, so no root socket exists at all. - -### Pre-check - -SSH in as the deploy user and run: - -```bash -curl -fsSL https://get.docker.com/rootless | sh --dry-run -``` - -If the output is clean, proceed. If it reports missing `newuidmap` or kernel namespace support, stay on Option B. 
- -### Installation (as the deploy user) - -```bash -curl -fsSL https://get.docker.com/rootless | sh - -echo 'export PATH=/home/deploy/bin:$PATH' >> ~/.bashrc -echo 'export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock' >> ~/.bashrc -source ~/.bashrc - -loginctl enable-linger deploy -systemctl --user enable docker -systemctl --user start docker +# Docker daemon is running +systemctl --user status docker +# Docker works without sudo or docker group docker run hello-world + +# No root socket access +ls /var/run/docker.sock # deploy user should get permission denied +groups # should NOT include 'docker' ``` -### After switching to rootless +## Security properties -1. Remove the sudoers file: `sudo rm /etc/sudoers.d/deploy-docker` -2. Strip the `sudo` prefix from all deploy scripts (`deploy.sh`, `deploy-staging.sh`, `backup-db.sh`, `restore-db.sh`, `staging-up.sh`, `staging-down.sh`) -3. Update `bootstrap-vps.sh` to replace the sudoers block with the rootless install steps -4. Re-run `loginctl enable-linger deploy` and `systemctl --user enable docker` to survive reboots +| Capability | Before (docker group) | After (rootless) | +| ------------------------------ | --------------------- | ---------------- | +| Run containers | ✓ | ✓ | +| Access root Docker socket | ✓ (root-equivalent) | ✗ | +| Escalate to root via Docker | ✓ | ✗ | +| Affect other users' containers | ✓ | ✗ | +| Survive deploy key compromise | ✗ | ✓ | -### Verification +## Reproducing on a fresh VPS -```bash -systemctl --user status docker # should show active (running) -docker run hello-world # should succeed without sudo -``` +`bootstrap-vps.sh` is fully automated. Prerequisites, linger, rootless install, and service enable/start are all handled. After the script completes, verify with the commands above. 
diff --git a/infra/scripts/backup-db.sh b/infra/scripts/backup-db.sh index 92c2135..7d64442 100755 --- a/infra/scripts/backup-db.sh +++ b/infra/scripts/backup-db.sh @@ -36,7 +36,7 @@ trap 'rm -f "${BACKUP_FILE}"' EXIT echo "${LOG_PREFIX} Starting backup at ${TIMESTAMP} (${LABEL})" -sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ +docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ pg_dump -U "${DATABASE_USER}" -d "${DATABASE_NAME}" \ | gzip > "${BACKUP_FILE}" diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index f83948d..f22a527 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -13,7 +13,7 @@ STATION_ROOT="/opt/station" apt update apt upgrade -y -apt install -y ca-certificates curl gnupg lsb-release cron logrotate rclone +apt install -y ca-certificates curl gnupg lsb-release cron logrotate rclone uidmap dbus-user-session install -m 0755 -d /etc/apt/keyrings if [ ! -f /etc/apt/keyrings/docker.asc ]; then @@ -47,23 +47,31 @@ if ! id -u "${DEPLOY_USER}" >/dev/null 2>&1; then useradd -m -s /bin/bash "${DEPLOY_USER}" fi -# Grant the deploy user narrowed sudo access to docker compose only. -# This is intentionally NOT docker group membership — the docker group is -# root-equivalent because it allows unrestricted access to the Docker socket. -# The sudoers entry below limits the deploy user to docker compose operations -# and docker exec (for pg_dump/psql backups), preventing privilege escalation. -# -# Preferred alternative: rootless Docker (see infra/docs/vps-setup.md). -# If the VPS kernel supports user namespaces, switch to rootless Docker and -# remove this sudoers entry — the deploy scripts already use `sudo docker` -# which becomes a no-op when the user owns their own Docker daemon. 
-SUDOERS_FILE="/etc/sudoers.d/deploy-docker" -cat > "${SUDOERS_FILE}" << 'SUDOEOF' -deploy ALL=(ALL) NOPASSWD: /usr/bin/docker compose * -deploy ALL=(ALL) NOPASSWD: /usr/bin/docker exec * -SUDOEOF -chmod 440 "${SUDOERS_FILE}" -visudo -c -f "${SUDOERS_FILE}" +# Rootless Docker: install and configure for the deploy user. +# The deploy user runs their own Docker daemon — no root socket, no docker +# group membership, no sudo required for docker commands. A leaked deploy +# SSH key cannot escalate to root or affect the host system. +loginctl enable-linger "${DEPLOY_USER}" + +DEPLOY_UID=$(id -u "${DEPLOY_USER}") + +# Set DOCKER_HOST and PATH in the deploy user's shell so rootless Docker is +# used automatically on SSH login and in cron. +BASHRC="${DEPLOY_HOME}/.bashrc" +if ! grep -q 'rootless docker' "${BASHRC}" 2>/dev/null; then + cat >> "${BASHRC}" << RCEOF + +# rootless docker +export PATH=\${HOME}/bin:\${PATH} +export DOCKER_HOST=unix:///run/user/${DEPLOY_UID}/docker.sock +RCEOF +fi + +# Install rootless Docker as the deploy user. +runuser -l "${DEPLOY_USER}" -c "curl -fsSL https://get.docker.com/rootless | sh" + +# Enable and start the rootless Docker service for the deploy user. +runuser -l "${DEPLOY_USER}" -c "systemctl --user enable docker && systemctl --user start docker" install -d -m 700 -o "${DEPLOY_USER}" -g "${DEPLOY_USER}" "${DEPLOY_HOME}/.ssh" touch "${DEPLOY_HOME}/.ssh/authorized_keys" @@ -99,5 +107,6 @@ echo "Bootstrap complete." echo "- Install Nginx configs from infra/nginx/ into /etc/nginx/sites-available/" echo "- Enable the sites and reload Nginx." echo "- Run infra/scripts/issue-certs.sh once DNS is live." -echo "- Confirm the deploy user can SSH and run: sudo docker compose ps" +echo "- Confirm rootless Docker: ssh deploy@host 'docker run hello-world'" +echo "- Confirm systemctl: ssh deploy@host 'systemctl --user status docker'" echo "- Configure B2 secrets and verify /opt/station/rclone.conf is written during deploy." 
diff --git a/infra/scripts/deploy-staging.sh b/infra/scripts/deploy-staging.sh index 6a58802..5d06815 100755 --- a/infra/scripts/deploy-staging.sh +++ b/infra/scripts/deploy-staging.sh @@ -2,6 +2,6 @@ set -euo pipefail cd /opt/station -sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml pull -sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d --no-deps backend frontend -sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps +docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml pull +docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d --no-deps backend frontend +docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps diff --git a/infra/scripts/deploy.sh b/infra/scripts/deploy.sh index 2adb04e..79375c4 100755 --- a/infra/scripts/deploy.sh +++ b/infra/scripts/deploy.sh @@ -2,6 +2,6 @@ set -euo pipefail cd /opt/station -sudo docker compose --env-file .env.production -f docker-compose.prod.yml pull -sudo docker compose --env-file .env.production -f docker-compose.prod.yml up -d --no-deps backend frontend -sudo docker compose --env-file .env.production -f docker-compose.prod.yml ps +docker compose --env-file .env.production -f docker-compose.prod.yml pull +docker compose --env-file .env.production -f docker-compose.prod.yml up -d --no-deps backend frontend +docker compose --env-file .env.production -f docker-compose.prod.yml ps diff --git a/infra/scripts/restore-db.sh b/infra/scripts/restore-db.sh index 849939a..0cabb7c 100755 --- a/infra/scripts/restore-db.sh +++ b/infra/scripts/restore-db.sh @@ -46,10 +46,10 @@ echo "${LOG_PREFIX} WARNING: if you need a clean replacement, drop and recreate echo "${LOG_PREFIX} Starting in 5 seconds. Press Ctrl+C to abort." 
sleep 5 -sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" stop backend -gunzip -c "${LOCAL_FILE}" | sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ +docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" stop backend +gunzip -c "${LOCAL_FILE}" | docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ psql -U "${DATABASE_USER}" -d "${DATABASE_NAME}" -sudo docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend +docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend rm -f "${LOCAL_FILE}" echo "${LOG_PREFIX} Restore complete" diff --git a/infra/scripts/staging-down.sh b/infra/scripts/staging-down.sh index 6ee68a6..7a97c16 100755 --- a/infra/scripts/staging-down.sh +++ b/infra/scripts/staging-down.sh @@ -2,4 +2,4 @@ set -euo pipefail cd /opt/station -sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml down +docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml down diff --git a/infra/scripts/staging-up.sh b/infra/scripts/staging-up.sh index ce21ede..e4991a8 100755 --- a/infra/scripts/staging-up.sh +++ b/infra/scripts/staging-up.sh @@ -2,5 +2,5 @@ set -euo pipefail cd /opt/station -sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d -sudo docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps +docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d +docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps From f30ddaab1f967fa4f65221393ae43b1a78927a49 Mon Sep 17 00:00:00 2001 From: Demian Date: Thu, 7 May 2026 01:25:38 -0400 Subject: [PATCH 03/19] docs: add rootless Docker migration runbook to vps-setup.md Step-by-step guide for migrating an existing VPS 
(with live containers) from the root Docker daemon to rootless, including postgres data preservation via pg_dump/restore and docker group removal. --- infra/docs/vps-setup.md | 100 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index 36d63e1..9647a17 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -55,3 +55,103 @@ groups # should NOT include 'docker' ## Reproducing on a fresh VPS `bootstrap-vps.sh` is fully automated. Prerequisites, linger, rootless install, and service enable/start are all handled. After the script completes, verify with the commands above. + +--- + +## Migrating an existing VPS to rootless Docker + +Use these steps when Docker is already running on a VPS (e.g. station-bot is live) and you need to move the deploy user's containers from the root daemon to rootless without data loss. Expected downtime: 2–3 minutes. + +### Phase 1 — Install prerequisites (no downtime) + +**As root:** + +```bash +apt install -y uidmap dbus-user-session +loginctl enable-linger deploy +``` + +### Phase 2 — Install rootless Docker (no downtime) + +**As deploy (new SSH session):** + +```bash +curl -fsSL https://get.docker.com/rootless | sh +``` + +### Phase 3 — Dump postgres data (no downtime) + +**As deploy — do NOT source .bashrc yet, commands must reach the root daemon:** + +```bash +set -a; source /opt/station-bot/.env.production; set +a +docker exec station-bot-postgres pg_dump -U "${POSTGRES_USER}" "${POSTGRES_DB}" > /tmp/station_bot_backup.sql +echo "Dump size: $(wc -c < /tmp/station_bot_backup.sql) bytes" +``` + +### Phase 4 — Cut over to rootless (downtime starts) + +**As deploy:** + +```bash +# Bring down root-daemon containers +cd /opt/station-bot +docker compose -f docker-compose.prod.yml down + +# Activate rootless in this session +export PATH=${HOME}/bin:${PATH} +export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock + +# Enable and start 
rootless service +systemctl --user enable docker +systemctl --user start docker + +# Confirm rootless is active +docker info | grep -i rootless +``` + +### Phase 5 — Restore data and bring services back up (downtime ends) + +**As deploy:** + +```bash +# Make DOCKER_HOST permanent +cat >> ~/.bashrc << 'RCEOF' + +# rootless docker +export PATH=${HOME}/bin:${PATH} +export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock +RCEOF + +# Start postgres under rootless daemon +cd /opt/station-bot +docker compose -f docker-compose.prod.yml up -d postgres + +# Wait for healthy +until docker compose -f docker-compose.prod.yml ps | grep -q "healthy"; do sleep 2; done + +# Restore data +docker exec -i station-bot-postgres psql -U "${POSTGRES_USER}" "${POSTGRES_DB}" < /tmp/station_bot_backup.sql + +# Start the bot +docker compose -f docker-compose.prod.yml up -d discord-bot + +# Verify +docker compose -f docker-compose.prod.yml ps +docker logs station-bot --tail 20 +``` + +### Phase 6 — Remove docker group access (only after confirming services are healthy) + +**As root:** + +```bash +gpasswd -d deploy docker +``` + +**As deploy (fresh SSH session to confirm clean environment):** + +```bash +docker compose -f /opt/station-bot/docker-compose.prod.yml ps +groups # docker should not appear +``` From ade80573b6782cade4b2a0dc213dc807ba9eb184 Mon Sep 17 00:00:00 2001 From: Demian Date: Thu, 7 May 2026 01:28:19 -0400 Subject: [PATCH 04/19] fix: address PR 156 review feedback - Remove deploy user from docker group on re-run (gpasswd -d) so a previous bootstrap that added docker group membership is cleaned up - Use absolute paths in deploy.sh, deploy-staging.sh, staging-up.sh, and staging-down.sh instead of relying on cd; eliminates ambiguity and makes invocation context-independent --- infra/scripts/bootstrap-vps.sh | 4 ++++ infra/scripts/deploy-staging.sh | 9 +++++---- infra/scripts/deploy.sh | 9 +++++---- infra/scripts/staging-down.sh | 5 +++-- infra/scripts/staging-up.sh | 7 ++++--- 5 
files changed, 21 insertions(+), 13 deletions(-) diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index f22a527..833f736 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -73,6 +73,10 @@ runuser -l "${DEPLOY_USER}" -c "curl -fsSL https://get.docker.com/rootless | sh" # Enable and start the rootless Docker service for the deploy user. runuser -l "${DEPLOY_USER}" -c "systemctl --user enable docker && systemctl --user start docker" +# Remove the deploy user from the docker group if they were added by a +# previous bootstrap run (rootless Docker requires no group membership). +gpasswd -d "${DEPLOY_USER}" docker 2>/dev/null || true + install -d -m 700 -o "${DEPLOY_USER}" -g "${DEPLOY_USER}" "${DEPLOY_HOME}/.ssh" touch "${DEPLOY_HOME}/.ssh/authorized_keys" chmod 600 "${DEPLOY_HOME}/.ssh/authorized_keys" diff --git a/infra/scripts/deploy-staging.sh b/infra/scripts/deploy-staging.sh index 5d06815..0cdd372 100755 --- a/infra/scripts/deploy-staging.sh +++ b/infra/scripts/deploy-staging.sh @@ -1,7 +1,8 @@ #!/bin/bash set -euo pipefail -cd /opt/station -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml pull -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d --no-deps backend frontend -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps +STATION_ROOT="/opt/station" + +docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" pull +docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" up -d --no-deps backend frontend +docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" ps diff --git a/infra/scripts/deploy.sh 
b/infra/scripts/deploy.sh index 79375c4..ee1a9a8 100755 --- a/infra/scripts/deploy.sh +++ b/infra/scripts/deploy.sh @@ -1,7 +1,8 @@ #!/bin/bash set -euo pipefail -cd /opt/station -docker compose --env-file .env.production -f docker-compose.prod.yml pull -docker compose --env-file .env.production -f docker-compose.prod.yml up -d --no-deps backend frontend -docker compose --env-file .env.production -f docker-compose.prod.yml ps +STATION_ROOT="/opt/station" + +docker compose --env-file "${STATION_ROOT}/.env.production" -f "${STATION_ROOT}/docker-compose.prod.yml" pull +docker compose --env-file "${STATION_ROOT}/.env.production" -f "${STATION_ROOT}/docker-compose.prod.yml" up -d --no-deps backend frontend +docker compose --env-file "${STATION_ROOT}/.env.production" -f "${STATION_ROOT}/docker-compose.prod.yml" ps diff --git a/infra/scripts/staging-down.sh b/infra/scripts/staging-down.sh index 7a97c16..27f0a05 100755 --- a/infra/scripts/staging-down.sh +++ b/infra/scripts/staging-down.sh @@ -1,5 +1,6 @@ #!/bin/bash set -euo pipefail -cd /opt/station -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml down +STATION_ROOT="/opt/station" + +docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" down diff --git a/infra/scripts/staging-up.sh b/infra/scripts/staging-up.sh index e4991a8..894be37 100755 --- a/infra/scripts/staging-up.sh +++ b/infra/scripts/staging-up.sh @@ -1,6 +1,7 @@ #!/bin/bash set -euo pipefail -cd /opt/station -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml up -d -docker compose --project-name station-staging --env-file .env.staging -f docker-compose.staging.yml ps +STATION_ROOT="/opt/station" + +docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" up -d +docker compose --project-name 
station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" ps From 00e2bcd283c2d6ddf834d59a4bdd571a5a6929e8 Mon Sep 17 00:00:00 2001 From: Demian Date: Thu, 7 May 2026 20:01:13 -0400 Subject: [PATCH 05/19] fix: address PR 156 round-2 review feedback - Add DOCKER_HOST self-initialization to all deploy/backup/restore scripts so they work correctly in cron and non-interactive SSH sessions - Fix root-ownership risk on ~deploy/.bashrc by chowning after append - Clarify bootstrap-vps.sh comment: deploy user has no access to root Docker socket, which still exists at /var/run/docker.sock for system use - Fix self-contradictory newuidmap table row in vps-setup.md - Add rootless-docker-migration.md documenting the live station-bot migration including AppArmor profile requirement and post-mortem --- infra/docs/rootless-docker-migration.md | 162 ++++++++++++++++++++++++ infra/docs/vps-setup.md | 10 +- infra/scripts/backup-db.sh | 2 + infra/scripts/bootstrap-vps.sh | 8 +- infra/scripts/deploy-staging.sh | 2 + infra/scripts/deploy.sh | 2 + infra/scripts/restore-db.sh | 2 + infra/scripts/staging-down.sh | 2 + infra/scripts/staging-up.sh | 2 + 9 files changed, 184 insertions(+), 8 deletions(-) create mode 100644 infra/docs/rootless-docker-migration.md diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md new file mode 100644 index 0000000..0642837 --- /dev/null +++ b/infra/docs/rootless-docker-migration.md @@ -0,0 +1,162 @@ +# Rootless Docker Migration — station-bot VPS + +**Date:** 2026-05-07 +**Host:** Cloud VPS (Ubuntu 24.04.4 LTS) +**Scope:** Migrated the `deploy` user's containers from the root Docker daemon to rootless Docker, removing `docker` group membership. + +--- + +## Why + +The `docker` group is root-equivalent. Any process that can reach `/var/run/docker.sock` can mount the host filesystem, run privileged containers, and escalate to root. 
If the deploy SSH key were ever leaked, an attacker would have had full root access to the host. + +Rootless Docker runs the daemon entirely inside the deploy user's own namespace. The socket lives at `/run/user//docker.sock` and is inaccessible to every other user. A compromised deploy key can only affect the deploy user's containers — nothing else on the host. + +--- + +## What changed + +- Rootless Docker daemon installed and running as the `deploy` user via `systemd --user` +- `DOCKER_HOST` and `PATH` written to `~deploy/.bashrc` so interactive sessions use the rootless socket automatically +- `deploy` removed from the `docker` group +- All containers (postgres, discord-bot) migrated to the rootless daemon with data intact +- Postgres data preserved via `pg_dump` / `psql` restore across daemons + +--- + +## Prerequisites installed + +```bash +apt install -y uidmap dbus-user-session +loginctl enable-linger deploy +``` + +- **`uidmap`** — provides `newuidmap`/`newgidmap`, the kernel tools that make user namespace ID mapping work. Required by rootlesskit, which underlies rootless Docker. +- **`dbus-user-session`** — enables per-user D-Bus sessions, which `systemd --user` needs to manage user-scoped services like the rootless Docker daemon. +- **`loginctl enable-linger`** — keeps the deploy user's systemd session alive after logout so the Docker daemon stays running without an active SSH session. + +--- + +## AppArmor profile + +Ubuntu 24.04 sets `/proc/sys/kernel/apparmor_restrict_unprivileged_userns=1` by default, which blocks rootlesskit from creating user namespaces. An explicit AppArmor profile is required to allow it: + +```bash +cat <, +include + +/home/deploy/bin/rootlesskit flags=(unconfined) { + userns, + include if exists +} +EOT +sudo systemctl restart apparmor.service +``` + +This grants rootlesskit permission to use user namespaces without granting broader privileges. 
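Whether the profile is needed at all can be checked before writing it. The following is a minimal sketch, assuming the sysctl path named above; the function name `userns_restricted` and its optional file argument are hypothetical and exist only so the check can be tried against a sample file.

```bash
#!/bin/bash
set -euo pipefail

# userns_restricted [FILE]: succeed (exit 0) when the AppArmor restriction on
# unprivileged user namespaces is switched on. FILE defaults to the kernel
# knob; passing another path is an illustration-only convenience.
userns_restricted() {
  local knob="${1:-/proc/sys/kernel/apparmor_restrict_unprivileged_userns}"
  # A missing file means the kernel does not expose the restriction at all.
  [ -f "${knob}" ] && [ "$(cat "${knob}")" = "1" ]
}

if userns_restricted; then
  echo "restricted: create the rootlesskit AppArmor profile before installing"
else
  echo "not restricted: no rootlesskit profile needed"
fi
```

On hosts where the knob is absent or set to `0`, the profile step can be skipped.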
+ +--- + +## Migration steps (for reference on future VPS) + +See the full runbook in [`vps-setup.md`](./vps-setup.md#migrating-an-existing-vps-to-rootless-docker). + +Summary: + +1. Install prerequisites as root (no downtime) +2. Install rootless Docker as deploy (no downtime) +3. `pg_dump` while root daemon still running (no downtime) +4. `docker compose down`, activate rootless in session, start rootless daemon (downtime starts) +5. Write `.bashrc`, start postgres under rootless, restore data, start bot (downtime ends ~2 min) +6. `gpasswd -d deploy docker` as root, verify in a fresh SSH session + +--- + +## Verification + +After migration, in a fresh SSH session as deploy: + +```bash +docker info | grep -i rootless # should output: rootless +groups # docker should NOT appear +docker compose -f /opt/station-bot/docker-compose.prod.yml ps +``` + +--- + +## Effect on deployments + +No change to the deployment workflow. SSH in as deploy, run the usual docker compose commands. `.bashrc` sets `DOCKER_HOST` automatically on login. + +--- + +## Post-mortem + +### Issue 1 — `set -a; source .env.production` executed non-variable lines as shell commands + +**What happened:** The `.env.production` file contains human-readable comments and descriptive text without `#` prefixes. Sourcing it with `set -a` caused bash to attempt to execute those lines as commands, producing errors like `Member: command not found`. + +**Fix:** Extract only the needed variables directly: + +```bash +POSTGRES_USER=$(grep '^POSTGRES_USER=' .env.production | cut -d= -f2) +POSTGRES_DB=$(grep '^POSTGRES_DB=' .env.production | cut -d= -f2) +``` + +**Lesson:** `set -a; source` assumes every non-comment line is a valid variable assignment. It's brittle against env files written for human readability. Either enforce strict `KEY=value` formatting in env files, or extract specific variables when sourcing them in scripts. 
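The "extract specific variables" approach from Issue 1 can be wrapped in a small helper. This is a sketch only: the name `env_get` and the sample file contents are hypothetical, and it assumes the keys it reads are written as plain `KEY=value` lines (quotes are not stripped).

```bash
#!/bin/bash
set -euo pipefail

# env_get KEY FILE: print the value of the first KEY=value line in FILE.
# Hypothetical helper for illustration; prints nothing when KEY is absent.
env_get() {
  local key="$1" file="$2"
  # cut -f2- keeps everything after the first '=', so values that contain '=' survive.
  # '|| true' keeps a missing key from aborting the script under set -e / pipefail.
  grep -m1 "^${key}=" "${file}" | cut -d= -f2- || true
}

# Demo against a throwaway file that mixes free text with assignments.
demo_env="$(mktemp)"
trap 'rm -f "${demo_env}"' EXIT
printf '%s\n' \
  'Member notes written without a comment marker' \
  'POSTGRES_USER=station' \
  'APP_NAME=STATION BACKEND' > "${demo_env}"

echo "POSTGRES_USER=$(env_get POSTGRES_USER "${demo_env}")"
echo "APP_NAME=$(env_get APP_NAME "${demo_env}")"
echo "MISSING=[$(env_get MISSING "${demo_env}")]"
```

Because the helper never executes the file, free-text lines are ignored rather than run as commands; a required key can still be enforced afterwards with the `: "${VAR:?...}"` pattern.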
+ +--- + +### Issue 2 — `FORCE_ROOTLESS_INSTALL=1` prefix only applied to `curl`, not to `sh` + +**What happened:** Running `FORCE_ROOTLESS_INSTALL=1 curl ... | sh` sets the variable in `curl`'s environment, not in the piped `sh` process. The installer still aborted. + +**Fix:** Place the variable before `sh`: + +```bash +curl -fsSL https://get.docker.com/rootless | FORCE_ROOTLESS_INSTALL=1 sh +``` + +**Lesson:** In a pipeline, each process inherits from the shell — not from the previous process in the pipe. Environment variable prefixes only apply to the command they immediately precede. + +--- + +### Issue 3 — AppArmor blocked rootlesskit on Ubuntu 24.04 + +**What happened:** Ubuntu 24.04 restricts unprivileged user namespaces via AppArmor by default. The rootless Docker installer failed with `fork/exec /proc/self/exe: permission denied`. The installer printed the fix but the AppArmor profile wasn't created before the first install attempt. + +**Fix:** Create the AppArmor profile for rootlesskit and restart the apparmor service before installing rootless Docker. See the profile above. + +**Lesson:** Ubuntu 24.04 is more locked down than previous LTS versions in this regard. The rootless Docker docs mention this but it's easy to miss. On any new Ubuntu 24.04 VPS, create the AppArmor profile as a prerequisite step before attempting rootless Docker installation. + +--- + +### Issue 4 — Partial failed install blocked the retry + +**What happened:** After the AppArmor fix, the installer detected the partial installation from the first failed attempt and refused to proceed. + +**Fix:** Clean up the partial install before retrying: + +```bash +systemctl --user stop docker +/home/deploy/bin/dockerd-rootless-setuptool.sh uninstall -f +rm -f /home/deploy/bin/dockerd +rm -rf /home/deploy/.local/share/docker +``` + +**Lesson:** The rootless Docker installer is not idempotent when a previous run failed partway through. Always clean up before retrying a failed install. 
+ +--- + +### Issue 5 — Post-install, docker commands targeted the rootless daemon instead of root daemon + +**What happened:** The installer switched the Docker CLI context to "rootless" on completion. Phase 3 (pg_dump) runs against the root daemon where station-bot's containers live, but after install `docker exec` was hitting the empty rootless daemon, producing `No such container`. + +**Fix:** Explicitly target the root daemon socket for Phase 3 commands: + +```bash +DOCKER_HOST=unix:///var/run/docker.sock docker exec station-bot-postgres pg_dump ... +``` + +**Lesson:** After installing rootless Docker, the CLI context changes immediately. Any commands that still need to reach the root daemon must override `DOCKER_HOST` explicitly until the cutover is complete. diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index 9647a17..d70d86a 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -20,11 +20,11 @@ The deploy scripts (`deploy.sh`, `backup-db.sh`, etc.) 
call `docker compose` dir ## Pre-check results (recorded 2026-05-07) -| Check | Result | -| --------------------------------------------- | --------------------------------------- | -| `/proc/sys/kernel/unprivileged_userns_clone` | `1` ✓ | -| `newuidmap` installed | No — installed via `uidmap` apt package | -| `unshare --user sh -c "echo namespaces work"` | `namespaces work` ✓ | +| Check | Result | +| --------------------------------------------- | ------------------------------------------------------------------ | +| `/proc/sys/kernel/unprivileged_userns_clone` | `1` ✓ | +| `newuidmap` installed | Not pre-installed — provided by `uidmap` package (installed above) | +| `unshare --user sh -c "echo namespaces work"` | `namespaces work` ✓ | ## Verification diff --git a/infra/scripts/backup-db.sh b/infra/scripts/backup-db.sh index 7d64442..e6ddb68 100755 --- a/infra/scripts/backup-db.sh +++ b/infra/scripts/backup-db.sh @@ -7,6 +7,8 @@ ENV_FILE="${STATION_ROOT}/.env.production" COMPOSE_FILE="${STATION_ROOT}/docker-compose.prod.yml" RCLONE_CONFIG_FILE="${STATION_ROOT}/rclone.conf" LOG_PREFIX="[backup]" +DOCKER_HOST="${DOCKER_HOST:-unix:///run/user/$(id -u)/docker.sock}" +export DOCKER_HOST if [ ! -f "${ENV_FILE}" ]; then echo "${LOG_PREFIX} Missing ${ENV_FILE}" >&2 diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index 833f736..5e042a1 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -48,9 +48,10 @@ if ! id -u "${DEPLOY_USER}" >/dev/null 2>&1; then fi # Rootless Docker: install and configure for the deploy user. -# The deploy user runs their own Docker daemon — no root socket, no docker -# group membership, no sudo required for docker commands. A leaked deploy -# SSH key cannot escalate to root or affect the host system. 
+# The deploy user runs their own Docker daemon with no access to the root +# Docker socket (which still exists at /var/run/docker.sock for system use), +# no docker group membership, and no sudo required. A leaked deploy SSH key +# cannot escalate to root or affect the host system. loginctl enable-linger "${DEPLOY_USER}" DEPLOY_UID=$(id -u "${DEPLOY_USER}") @@ -66,6 +67,7 @@ export PATH=\${HOME}/bin:\${PATH} export DOCKER_HOST=unix:///run/user/${DEPLOY_UID}/docker.sock RCEOF fi +chown "${DEPLOY_USER}:${DEPLOY_USER}" "${BASHRC}" # Install rootless Docker as the deploy user. runuser -l "${DEPLOY_USER}" -c "curl -fsSL https://get.docker.com/rootless | sh" diff --git a/infra/scripts/deploy-staging.sh b/infra/scripts/deploy-staging.sh index 0cdd372..b7eae60 100755 --- a/infra/scripts/deploy-staging.sh +++ b/infra/scripts/deploy-staging.sh @@ -2,6 +2,8 @@ set -euo pipefail STATION_ROOT="/opt/station" +DOCKER_HOST="${DOCKER_HOST:-unix:///run/user/$(id -u)/docker.sock}" +export DOCKER_HOST docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" pull docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" up -d --no-deps backend frontend diff --git a/infra/scripts/deploy.sh b/infra/scripts/deploy.sh index ee1a9a8..c2481cc 100755 --- a/infra/scripts/deploy.sh +++ b/infra/scripts/deploy.sh @@ -2,6 +2,8 @@ set -euo pipefail STATION_ROOT="/opt/station" +DOCKER_HOST="${DOCKER_HOST:-unix:///run/user/$(id -u)/docker.sock}" +export DOCKER_HOST docker compose --env-file "${STATION_ROOT}/.env.production" -f "${STATION_ROOT}/docker-compose.prod.yml" pull docker compose --env-file "${STATION_ROOT}/.env.production" -f "${STATION_ROOT}/docker-compose.prod.yml" up -d --no-deps backend frontend diff --git a/infra/scripts/restore-db.sh b/infra/scripts/restore-db.sh index 0cabb7c..1ee7f2f 100755 --- a/infra/scripts/restore-db.sh +++ 
b/infra/scripts/restore-db.sh @@ -15,6 +15,8 @@ RCLONE_CONFIG_FILE="${STATION_ROOT}/rclone.conf" LOG_PREFIX="[restore]" BACKUP_PATH="$1" LOCAL_FILE="/tmp/restore_$(date +%s).sql.gz" +DOCKER_HOST="${DOCKER_HOST:-unix:///run/user/$(id -u)/docker.sock}" +export DOCKER_HOST if [ ! -f "${ENV_FILE}" ]; then echo "${LOG_PREFIX} Missing ${ENV_FILE}" >&2 diff --git a/infra/scripts/staging-down.sh b/infra/scripts/staging-down.sh index 27f0a05..a5090ca 100755 --- a/infra/scripts/staging-down.sh +++ b/infra/scripts/staging-down.sh @@ -2,5 +2,7 @@ set -euo pipefail STATION_ROOT="/opt/station" +DOCKER_HOST="${DOCKER_HOST:-unix:///run/user/$(id -u)/docker.sock}" +export DOCKER_HOST docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" down diff --git a/infra/scripts/staging-up.sh b/infra/scripts/staging-up.sh index 894be37..d0ad20f 100755 --- a/infra/scripts/staging-up.sh +++ b/infra/scripts/staging-up.sh @@ -2,6 +2,8 @@ set -euo pipefail STATION_ROOT="/opt/station" +DOCKER_HOST="${DOCKER_HOST:-unix:///run/user/$(id -u)/docker.sock}" +export DOCKER_HOST docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" up -d docker compose --project-name station-staging --env-file "${STATION_ROOT}/.env.staging" -f "${STATION_ROOT}/docker-compose.staging.yml" ps From b4505d53e77fda3ed0372d0b5f77390a31f079bd Mon Sep 17 00:00:00 2001 From: Demian Date: Thu, 7 May 2026 20:18:06 -0400 Subject: [PATCH 06/19] fix: address PR 156 round-3 review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Fix .bashrc comment: DOCKER_HOST only affects interactive/login shells, not cron — non-interactive scripts set it themselves - Fix hardcoded UID in .bashrc heredoc: use quoted heredoc so $(id -u) evaluates dynamically at login rather than being baked in at bootstrap time - Add AppArmor profile 
creation to bootstrap-vps.sh for Ubuntu 24.04+ where unprivileged user namespaces are restricted by default - Update vps-setup.md 'fully automated' claim to mention AppArmor handling - Fix Phase 3 pg_dump command in migration runbook to use targeted grep extraction instead of set -a source, which fails on non-strict env files --- infra/docs/vps-setup.md | 5 +++-- infra/scripts/bootstrap-vps.sh | 33 +++++++++++++++++++++++++++------ 2 files changed, 30 insertions(+), 8 deletions(-) diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index d70d86a..d8736d1 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -54,7 +54,7 @@ groups # should NOT include 'docker' ## Reproducing on a fresh VPS -`bootstrap-vps.sh` is fully automated. Prerequisites, linger, rootless install, and service enable/start are all handled. After the script completes, verify with the commands above. +`bootstrap-vps.sh` is fully automated. Prerequisites, linger, AppArmor profile (Ubuntu 24.04+), rootless install, and service enable/start are all handled. After the script completes, verify with the commands above. --- @@ -84,7 +84,8 @@ curl -fsSL https://get.docker.com/rootless | sh **As deploy — do NOT source .bashrc yet, commands must reach the root daemon:** ```bash -set -a; source /opt/station-bot/.env.production; set +a +POSTGRES_USER=$(grep '^POSTGRES_USER=' /opt/station-bot/.env.production | cut -d= -f2) +POSTGRES_DB=$(grep '^POSTGRES_DB=' /opt/station-bot/.env.production | cut -d= -f2) docker exec station-bot-postgres pg_dump -U "${POSTGRES_USER}" "${POSTGRES_DB}" > /tmp/station_bot_backup.sql echo "Dump size: $(wc -c < /tmp/station_bot_backup.sql) bytes" ``` diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index 5e042a1..aaaa648 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -54,21 +54,42 @@ fi # cannot escalate to root or affect the host system. 
loginctl enable-linger "${DEPLOY_USER}" -DEPLOY_UID=$(id -u "${DEPLOY_USER}") - # Set DOCKER_HOST and PATH in the deploy user's shell so rootless Docker is -# used automatically on SSH login and in cron. +# used automatically on interactive/login SSH sessions. Non-interactive shells +# (cron, CI) must set DOCKER_HOST themselves — the deploy/backup scripts do this. BASHRC="${DEPLOY_HOME}/.bashrc" if ! grep -q 'rootless docker' "${BASHRC}" 2>/dev/null; then - cat >> "${BASHRC}" << RCEOF + cat >> "${BASHRC}" << 'RCEOF' # rootless docker -export PATH=\${HOME}/bin:\${PATH} -export DOCKER_HOST=unix:///run/user/${DEPLOY_UID}/docker.sock +export PATH=${HOME}/bin:${PATH} +export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock RCEOF fi chown "${DEPLOY_USER}:${DEPLOY_USER}" "${BASHRC}" +# Ubuntu 24.04+ restricts unprivileged user namespaces via AppArmor; rootlesskit +# requires an explicit profile to create user namespaces. +APPARMOR_RESTRICT="/proc/sys/kernel/apparmor_restrict_unprivileged_userns" +if [ -f "${APPARMOR_RESTRICT}" ] && [ "$(cat "${APPARMOR_RESTRICT}")" = "1" ]; then + ROOTLESSKIT_PROFILE="/etc/apparmor.d/home.deploy.bin.rootlesskit" + if [ ! -f "${ROOTLESSKIT_PROFILE}" ]; then + cat > "${ROOTLESSKIT_PROFILE}" << 'AAEOF' +# ref: https://ubuntu.com/blog/ubuntu-23-10-restricted-unprivileged-user-namespaces +abi , +include + +/home/deploy/bin/rootlesskit flags=(unconfined) { + userns, + + # Site-specific additions and overrides. See local/README for details. + include if exists +} +AAEOF + systemctl restart apparmor.service + fi +fi + # Install rootless Docker as the deploy user. 
runuser -l "${DEPLOY_USER}" -c "curl -fsSL https://get.docker.com/rootless | sh" From 86798f614e0602a83be1b2d0c001173579f62567 Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sat, 9 May 2026 23:11:17 -0400 Subject: [PATCH 07/19] fix: address PR 156 round-4 review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - backup-db.sh, restore-db.sh: replace set -a/source with grep-based variable extraction to avoid failures when env file contains values with spaces (e.g. APP_NAME=STATION BACKEND) - bootstrap-vps.sh: tighten hardening comment — leaked key cannot escalate to root or access other users' containers, but can still affect deploy-user-owned resources - vps-setup.md: qualify overview security claim to match actual rootless Docker guarantees - rootless-docker-migration.md: fix "inaccessible to every other user" — root can still access the socket; correct to "non-root users" --- infra/docs/rootless-docker-migration.md | 2 +- infra/docs/vps-setup.md | 2 +- infra/scripts/backup-db.sh | 7 ++++--- infra/scripts/bootstrap-vps.sh | 2 +- infra/scripts/restore-db.sh | 6 +++--- 5 files changed, 10 insertions(+), 9 deletions(-) diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md index 0642837..0eacd8d 100644 --- a/infra/docs/rootless-docker-migration.md +++ b/infra/docs/rootless-docker-migration.md @@ -10,7 +10,7 @@ The `docker` group is root-equivalent. Any process that can reach `/var/run/docker.sock` can mount the host filesystem, run privileged containers, and escalate to root. If the deploy SSH key were ever leaked, an attacker would have had full root access to the host. -Rootless Docker runs the daemon entirely inside the deploy user's own namespace. The socket lives at `/run/user//docker.sock` and is inaccessible to every other user. A compromised deploy key can only affect the deploy user's containers — nothing else on the host. 
+Rootless Docker runs the daemon entirely inside the deploy user's own namespace. The socket lives at `/run/user//docker.sock` and is inaccessible to other non-root users. A compromised deploy key can only affect the deploy user's containers — it cannot escalate to root or access other users' containers via Docker. --- diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index d8736d1..aba8f48 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -2,7 +2,7 @@ ## Overview -The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker operations — nothing more. This is achieved with rootless Docker: the deploy user runs their own Docker daemon entirely within their user namespace, with no root socket and no docker group membership. A compromised key cannot escalate to root or affect any other service on the host. +The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker operations — nothing more. This is achieved with rootless Docker: the deploy user runs their own Docker daemon entirely within their user namespace, with no root socket and no docker group membership. A compromised key cannot escalate to root or access other users' containers via Docker — though the deploy user can still affect resources they own (files, CPU, memory). ## Approach: rootless Docker diff --git a/infra/scripts/backup-db.sh b/infra/scripts/backup-db.sh index e6ddb68..a5be708 100755 --- a/infra/scripts/backup-db.sh +++ b/infra/scripts/backup-db.sh @@ -20,9 +20,10 @@ if [ ! 
-f "${RCLONE_CONFIG_FILE}" ]; then exit 1 fi -set -a -source "${ENV_FILE}" -set +a +DATABASE_USER="$(grep '^DATABASE_USER=' "${ENV_FILE}" | cut -d= -f2-)" +DATABASE_NAME="$(grep '^DATABASE_NAME=' "${ENV_FILE}" | cut -d= -f2-)" +B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2-)" +BACKUP_HEALTHCHECK_URL="$(grep '^BACKUP_HEALTHCHECK_URL=' "${ENV_FILE}" | cut -d= -f2-)" : "${DATABASE_USER:?DATABASE_USER is required}" : "${DATABASE_NAME:?DATABASE_NAME is required}" diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index aaaa648..a55c910 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -51,7 +51,7 @@ fi # The deploy user runs their own Docker daemon with no access to the root # Docker socket (which still exists at /var/run/docker.sock for system use), # no docker group membership, and no sudo required. A leaked deploy SSH key -# cannot escalate to root or affect the host system. +# cannot escalate to root or access other users' containers via Docker. loginctl enable-linger "${DEPLOY_USER}" # Set DOCKER_HOST and PATH in the deploy user's shell so rootless Docker is diff --git a/infra/scripts/restore-db.sh b/infra/scripts/restore-db.sh index 1ee7f2f..e23b4cc 100755 --- a/infra/scripts/restore-db.sh +++ b/infra/scripts/restore-db.sh @@ -28,9 +28,9 @@ if [ ! 
-f "${RCLONE_CONFIG_FILE}" ]; then exit 1 fi -set -a -source "${ENV_FILE}" -set +a +DATABASE_USER="$(grep '^DATABASE_USER=' "${ENV_FILE}" | cut -d= -f2-)" +DATABASE_NAME="$(grep '^DATABASE_NAME=' "${ENV_FILE}" | cut -d= -f2-)" +B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2-)" : "${DATABASE_USER:?DATABASE_USER is required}" : "${DATABASE_NAME:?DATABASE_NAME is required}" From 357f900f7109eb790bf729899640908966357a6d Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sat, 9 May 2026 23:42:35 -0400 Subject: [PATCH 08/19] fix: address PR 156 round-5 review feedback - backup-db.sh: add `|| true` to BACKUP_HEALTHCHECK_URL grep so a missing key doesn't abort the script under set -e - bootstrap-vps.sh: derive AppArmor profile filename and rootlesskit binary path from DEPLOY_HOME instead of hard-coded /home/deploy, so the profile correctly targets the configured deploy user --- infra/scripts/backup-db.sh | 2 +- infra/scripts/bootstrap-vps.sh | 9 +++++---- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/infra/scripts/backup-db.sh b/infra/scripts/backup-db.sh index a5be708..83ea9b9 100755 --- a/infra/scripts/backup-db.sh +++ b/infra/scripts/backup-db.sh @@ -23,7 +23,7 @@ fi DATABASE_USER="$(grep '^DATABASE_USER=' "${ENV_FILE}" | cut -d= -f2-)" DATABASE_NAME="$(grep '^DATABASE_NAME=' "${ENV_FILE}" | cut -d= -f2-)" B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2-)" -BACKUP_HEALTHCHECK_URL="$(grep '^BACKUP_HEALTHCHECK_URL=' "${ENV_FILE}" | cut -d= -f2-)" +BACKUP_HEALTHCHECK_URL="$(grep '^BACKUP_HEALTHCHECK_URL=' "${ENV_FILE}" | cut -d= -f2- || true)" : "${DATABASE_USER:?DATABASE_USER is required}" : "${DATABASE_NAME:?DATABASE_NAME is required}" diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index a55c910..d5634bd 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -72,18 +72,19 @@ chown "${DEPLOY_USER}:${DEPLOY_USER}" "${BASHRC}" # requires an explicit profile to create user 
namespaces. APPARMOR_RESTRICT="/proc/sys/kernel/apparmor_restrict_unprivileged_userns" if [ -f "${APPARMOR_RESTRICT}" ] && [ "$(cat "${APPARMOR_RESTRICT}")" = "1" ]; then - ROOTLESSKIT_PROFILE="/etc/apparmor.d/home.deploy.bin.rootlesskit" + PROFILE_SLUG="$(echo "${DEPLOY_HOME}/bin/rootlesskit" | sed 's|^/||; s|/|.|g')" + ROOTLESSKIT_PROFILE="/etc/apparmor.d/${PROFILE_SLUG}" if [ ! -f "${ROOTLESSKIT_PROFILE}" ]; then - cat > "${ROOTLESSKIT_PROFILE}" << 'AAEOF' + cat > "${ROOTLESSKIT_PROFILE}" << AAEOF # ref: https://ubuntu.com/blog/ubuntu-23-10-restricted-unprivileged-user-namespaces abi , include -/home/deploy/bin/rootlesskit flags=(unconfined) { +${DEPLOY_HOME}/bin/rootlesskit flags=(unconfined) { userns, # Site-specific additions and overrides. See local/README for details. - include if exists + include if exists } AAEOF systemctl restart apparmor.service From ff112b802b6fe2db4bc688fa3c90232b9d88b74f Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sat, 9 May 2026 23:55:50 -0400 Subject: [PATCH 09/19] fix: address PR 156 round-6 review feedback - backup-db.sh, restore-db.sh: add || true to all required-var grep substitutions so missing keys reach the explicit :? 
error messages instead of silently aborting under set -euo pipefail - restore-db.sh: add trap to remove temp file on EXIT so interrupted restores don't accumulate large artifacts in /tmp - vps-setup.md Phase 3: use explicit DOCKER_HOST=unix:///var/run/docker.sock for the pg_dump step; rootless installer often switches CLI context immediately, making "don't source .bashrc" insufficient - vps-setup.md Phase 5: use explicit DOCKER_HOST for rootless socket on the psql restore step to remove ambiguity - bootstrap-vps.sh: skip rootless install if daemon is already healthy; clean up partial installs before retrying to avoid installer getting stuck --- infra/docs/vps-setup.md | 8 ++++---- infra/scripts/backup-db.sh | 6 +++--- infra/scripts/bootstrap-vps.sh | 18 +++++++++++++++++- infra/scripts/restore-db.sh | 8 ++++---- 4 files changed, 28 insertions(+), 12 deletions(-) diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index aba8f48..fbfc120 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -81,12 +81,12 @@ curl -fsSL https://get.docker.com/rootless | sh ### Phase 3 — Dump postgres data (no downtime) -**As deploy — do NOT source .bashrc yet, commands must reach the root daemon:** +**As deploy — explicitly target the root daemon; the rootless installer may have switched the CLI context:** ```bash POSTGRES_USER=$(grep '^POSTGRES_USER=' /opt/station-bot/.env.production | cut -d= -f2) POSTGRES_DB=$(grep '^POSTGRES_DB=' /opt/station-bot/.env.production | cut -d= -f2) -docker exec station-bot-postgres pg_dump -U "${POSTGRES_USER}" "${POSTGRES_DB}" > /tmp/station_bot_backup.sql +DOCKER_HOST=unix:///var/run/docker.sock docker exec station-bot-postgres pg_dump -U "${POSTGRES_USER}" "${POSTGRES_DB}" > /tmp/station_bot_backup.sql echo "Dump size: $(wc -c < /tmp/station_bot_backup.sql) bytes" ``` @@ -131,8 +131,8 @@ docker compose -f docker-compose.prod.yml up -d postgres # Wait for healthy until docker compose -f docker-compose.prod.yml ps | grep -q 
"healthy"; do sleep 2; done -# Restore data -docker exec -i station-bot-postgres psql -U "${POSTGRES_USER}" "${POSTGRES_DB}" < /tmp/station_bot_backup.sql +# Restore data (DOCKER_HOST already set to rootless socket above) +DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock docker exec -i station-bot-postgres psql -U "${POSTGRES_USER}" "${POSTGRES_DB}" < /tmp/station_bot_backup.sql # Start the bot docker compose -f docker-compose.prod.yml up -d discord-bot diff --git a/infra/scripts/backup-db.sh b/infra/scripts/backup-db.sh index 83ea9b9..eb96494 100755 --- a/infra/scripts/backup-db.sh +++ b/infra/scripts/backup-db.sh @@ -20,9 +20,9 @@ if [ ! -f "${RCLONE_CONFIG_FILE}" ]; then exit 1 fi -DATABASE_USER="$(grep '^DATABASE_USER=' "${ENV_FILE}" | cut -d= -f2-)" -DATABASE_NAME="$(grep '^DATABASE_NAME=' "${ENV_FILE}" | cut -d= -f2-)" -B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2-)" +DATABASE_USER="$(grep '^DATABASE_USER=' "${ENV_FILE}" | cut -d= -f2- || true)" +DATABASE_NAME="$(grep '^DATABASE_NAME=' "${ENV_FILE}" | cut -d= -f2- || true)" +B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2- || true)" BACKUP_HEALTHCHECK_URL="$(grep '^BACKUP_HEALTHCHECK_URL=' "${ENV_FILE}" | cut -d= -f2- || true)" : "${DATABASE_USER:?DATABASE_USER is required}" diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index d5634bd..d6fb9ea 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -92,7 +92,23 @@ AAEOF fi # Install rootless Docker as the deploy user. -runuser -l "${DEPLOY_USER}" -c "curl -fsSL https://get.docker.com/rootless | sh" +# If dockerd is already running rootless and healthy, skip the install. +# If a partial install is detected (binary present but daemon not healthy), +# clean up before retrying so the installer does not get stuck. 
+if runuser -l "${DEPLOY_USER}" -c "systemctl --user is-active docker >/dev/null 2>&1"; then + echo "Rootless Docker already active for ${DEPLOY_USER} — skipping install." +else + if runuser -l "${DEPLOY_USER}" -c "[ -f \"\${HOME}/bin/dockerd\" ]"; then + echo "Partial rootless install detected — cleaning up before retry." + runuser -l "${DEPLOY_USER}" -c " + systemctl --user stop docker 2>/dev/null || true + \"\${HOME}/bin/dockerd-rootless-setuptool.sh\" uninstall -f 2>/dev/null || true + rm -f \"\${HOME}/bin/dockerd\" + rm -rf \"\${HOME}/.local/share/docker\" + " + fi + runuser -l "${DEPLOY_USER}" -c "curl -fsSL https://get.docker.com/rootless | sh" +fi # Enable and start the rootless Docker service for the deploy user. runuser -l "${DEPLOY_USER}" -c "systemctl --user enable docker && systemctl --user start docker" diff --git a/infra/scripts/restore-db.sh b/infra/scripts/restore-db.sh index e23b4cc..b9d6fbd 100755 --- a/infra/scripts/restore-db.sh +++ b/infra/scripts/restore-db.sh @@ -28,15 +28,16 @@ if [ ! 
-f "${RCLONE_CONFIG_FILE}" ]; then exit 1 fi -DATABASE_USER="$(grep '^DATABASE_USER=' "${ENV_FILE}" | cut -d= -f2-)" -DATABASE_NAME="$(grep '^DATABASE_NAME=' "${ENV_FILE}" | cut -d= -f2-)" -B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2-)" +DATABASE_USER="$(grep '^DATABASE_USER=' "${ENV_FILE}" | cut -d= -f2- || true)" +DATABASE_NAME="$(grep '^DATABASE_NAME=' "${ENV_FILE}" | cut -d= -f2- || true)" +B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2- || true)" : "${DATABASE_USER:?DATABASE_USER is required}" : "${DATABASE_NAME:?DATABASE_NAME is required}" : "${B2_BUCKET:?B2_BUCKET is required}" export RCLONE_CONFIG="${RCLONE_CONFIG_FILE}" +trap 'rm -f "${LOCAL_FILE}"' EXIT echo "${LOG_PREFIX} Downloading ${BACKUP_PATH} from b2:${B2_BUCKET}" rclone copyto "b2:${B2_BUCKET}/${BACKUP_PATH}" "${LOCAL_FILE}" \ @@ -53,5 +54,4 @@ gunzip -c "${LOCAL_FILE}" | docker compose --env-file "${ENV_FILE}" -f "${COMPOS psql -U "${DATABASE_USER}" -d "${DATABASE_NAME}" docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend -rm -f "${LOCAL_FILE}" echo "${LOG_PREFIX} Restore complete" From 4d7fb9afe48d0e1514c35e87c43109fd3ba8a273 Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 00:35:33 -0400 Subject: [PATCH 10/19] fix: address PR 156 round-7 review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - bootstrap-vps.sh: add docker-ce-rootless-extras to the apt install block and replace curl|sh with dockerd-rootless-setuptool.sh install; the setup tool ships with the APT package and is already signed/pinned to the configured Docker repo — no remote script execution needed - vps-setup.md: update prerequisite list and Phase 2 migration runbook to use apt install docker-ce-rootless-extras + dockerd-rootless-setuptool.sh - rootless-docker-migration.md: add install-method note pointing to the APT path and clarifying that the curl|sh references in the post-mortem are historical context, not
operator instructions --- infra/docs/rootless-docker-migration.md | 6 ++++-- infra/docs/vps-setup.md | 12 +++++++++--- infra/scripts/bootstrap-vps.sh | 4 +++- 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md index 0eacd8d..3f95f36 100644 --- a/infra/docs/rootless-docker-migration.md +++ b/infra/docs/rootless-docker-migration.md @@ -62,10 +62,12 @@ This grants rootlesskit permission to use user namespaces without granting broad See the full runbook in [`vps-setup.md`](./vps-setup.md#migrating-an-existing-vps-to-rootless-docker). +> **Install method:** use `apt install docker-ce-rootless-extras` (already signed and pinned to the Docker APT repo) then `dockerd-rootless-setuptool.sh install` as the deploy user. This avoids `curl | sh`. The post-mortem below references the old `curl | sh` path as historical context for what was run during the original migration. + Summary: -1. Install prerequisites as root (no downtime) -2. Install rootless Docker as deploy (no downtime) +1. Install prerequisites + `docker-ce-rootless-extras` as root (no downtime) +2. Run `dockerd-rootless-setuptool.sh install` as deploy (no downtime) 3. `pg_dump` while root daemon still running (no downtime) 4. `docker compose down`, activate rootless in session, start rootless daemon (downtime starts) 5. Write `.bashrc`, start postgres under rootless, restore data, start bot (downtime ends ~2 min) diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index fbfc120..de904f1 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -10,10 +10,10 @@ The deploy user's Docker daemon runs unprivileged inside a user namespace. 
There `bootstrap-vps.sh` handles the full setup: -- Installs `uidmap` and `dbus-user-session` prerequisites +- Installs `uidmap`, `dbus-user-session`, and `docker-ce-rootless-extras` prerequisites - Enables linger so the deploy user's systemd session persists without an active login - Sets `DOCKER_HOST` and `PATH` in `~deploy/.bashrc` -- Installs rootless Docker via `curl -fsSL https://get.docker.com/rootless | sh` (run as the deploy user) +- Installs rootless Docker via `dockerd-rootless-setuptool.sh install` (ships with `docker-ce-rootless-extras`, no remote script execution) - Enables and starts the `docker` systemd user service The deploy scripts (`deploy.sh`, `backup-db.sh`, etc.) call `docker compose` directly — no `sudo` required. @@ -73,10 +73,16 @@ loginctl enable-linger deploy ### Phase 2 — Install rootless Docker (no downtime) +**As root — install the APT package that ships the setup tool:** + +```bash +apt install -y docker-ce-rootless-extras +``` + **As deploy (new SSH session):** ```bash -curl -fsSL https://get.docker.com/rootless | sh +dockerd-rootless-setuptool.sh install ``` ### Phase 3 — Dump postgres data (no downtime) diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index d6fb9ea..63b891e 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -32,6 +32,7 @@ apt update apt install -y \ docker-ce \ docker-ce-cli \ + docker-ce-rootless-extras \ containerd.io \ docker-buildx-plugin \ docker-compose-plugin \ @@ -107,7 +108,8 @@ else rm -rf \"\${HOME}/.local/share/docker\" " fi - runuser -l "${DEPLOY_USER}" -c "curl -fsSL https://get.docker.com/rootless | sh" + # Use the APT-installed setup tool — no remote script execution needed. + runuser -l "${DEPLOY_USER}" -c "dockerd-rootless-setuptool.sh install" fi # Enable and start the rootless Docker service for the deploy user. 
From 690e6ec2905d1d21c77c8f81784ac0ca21f3809d Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 00:47:08 -0400 Subject: [PATCH 11/19] fix: address PR 156 round-8 review feedback - restore-db.sh: expand EXIT trap into a cleanup function that restarts backend if the restore pipeline fails after docker compose stop backend; a BACKEND_STOPPED flag ensures the restart only fires if the stop already ran, avoiding spurious start calls on early exits --- infra/scripts/restore-db.sh | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/infra/scripts/restore-db.sh b/infra/scripts/restore-db.sh index b9d6fbd..7dda235 100755 --- a/infra/scripts/restore-db.sh +++ b/infra/scripts/restore-db.sh @@ -37,7 +37,15 @@ B2_BUCKET="$(grep '^B2_BUCKET=' "${ENV_FILE}" | cut -d= -f2- || true)" : "${B2_BUCKET:?B2_BUCKET is required}" export RCLONE_CONFIG="${RCLONE_CONFIG_FILE}" -trap 'rm -f "${LOCAL_FILE}"' EXIT +BACKEND_STOPPED=0 +cleanup() { + rm -f "${LOCAL_FILE}" + if [ "${BACKEND_STOPPED}" -eq 1 ]; then + echo "${LOG_PREFIX} Ensuring backend is started after exit..." >&2 + docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend || true + fi +} +trap cleanup EXIT echo "${LOG_PREFIX} Downloading ${BACKUP_PATH} from b2:${B2_BUCKET}" rclone copyto "b2:${B2_BUCKET}/${BACKUP_PATH}" "${LOCAL_FILE}" \ @@ -50,8 +58,10 @@ echo "${LOG_PREFIX} Starting in 5 seconds. Press Ctrl+C to abort." 
sleep 5 docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" stop backend +BACKEND_STOPPED=1 gunzip -c "${LOCAL_FILE}" | docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ psql -U "${DATABASE_USER}" -d "${DATABASE_NAME}" docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend +BACKEND_STOPPED=0 echo "${LOG_PREFIX} Restore complete" From 45fd4520aa080e834f36766bc0087de0960a2e65 Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 00:53:40 -0400 Subject: [PATCH 12/19] fix: address PR 156 round-9 review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - vps-setup.md: rephrase "no root socket" to "no access to the root Docker socket" — the root daemon and its socket still exist on the host; the hardening property is that the deploy user cannot reach it --- infra/docs/vps-setup.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index de904f1..a8c0e2e 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -2,7 +2,7 @@ ## Overview -The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker operations — nothing more. This is achieved with rootless Docker: the deploy user runs their own Docker daemon entirely within their user namespace, with no root socket and no docker group membership. A compromised key cannot escalate to root or access other users' containers via Docker — though the deploy user can still affect resources they own (files, CPU, memory). +The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker operations — nothing more. 
This is achieved with rootless Docker: the deploy user runs their own Docker daemon entirely within their user namespace, with no access to the root Docker socket and no docker group membership. A compromised key cannot escalate to root or access other users' containers via Docker — though the deploy user can still affect resources they own (files, CPU, memory). ## Approach: rootless Docker From e6a0c25b61f759952c0c0a8b596c3efd7af6ab6c Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 01:10:50 -0400 Subject: [PATCH 13/19] fix: address PR 156 round-10 review feedback - bootstrap-vps.sh: resolve rootlesskit binary path via command -v instead of assuming ~/bin/rootlesskit; docker-ce-rootless-extras installs to /usr/bin/rootlesskit so the AppArmor profile must target that path or it will not apply on Ubuntu 24.04 - bootstrap-vps.sh: fix partial-install detection to check ~/.config/systemd/user/docker.service rather than ~/bin/dockerd, which no longer exists with the APT-based install - bootstrap-vps.sh: remove PATH=${HOME}/bin export from .bashrc; docker-ce-rootless-extras installs binaries to /usr/bin, ~/bin is not needed - vps-setup.md: fix cut -d= -f2 to cut -d= -f2- in Phase 3 grep commands to match how the scripts parse env vars (handles = in values) - vps-setup.md: remove PATH=${HOME}/bin exports from Phase 4 and Phase 5 .bashrc block to match bootstrap-vps.sh and APT install path - rootless-docker-migration.md: rewrite AppArmor profile section to resolve rootlesskit path dynamically via command -v, derive the profile filename from the resolved path --- infra/docs/rootless-docker-migration.md | 13 +++++++++---- infra/docs/vps-setup.md | 10 ++++------ infra/scripts/bootstrap-vps.sh | 18 ++++++++++-------- 3 files changed, 23 insertions(+), 18 deletions(-) diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md index 3f95f36..3697b0a 100644 --- a/infra/docs/rootless-docker-migration.md +++ 
b/infra/docs/rootless-docker-migration.md @@ -39,16 +39,21 @@ loginctl enable-linger deploy ## AppArmor profile -Ubuntu 24.04 sets `/proc/sys/kernel/apparmor_restrict_unprivileged_userns=1` by default, which blocks rootlesskit from creating user namespaces. An explicit AppArmor profile is required to allow it: +Ubuntu 24.04 sets `/proc/sys/kernel/apparmor_restrict_unprivileged_userns=1` by default, which blocks rootlesskit from creating user namespaces. An explicit AppArmor profile is required to allow it. + +With `docker-ce-rootless-extras` installed via APT, `rootlesskit` lives at `/usr/bin/rootlesskit`. Resolve the path dynamically so the profile filename and binary path stay correct regardless of install method: ```bash -cat <, include -/home/deploy/bin/rootlesskit flags=(unconfined) { +${ROOTLESSKIT_BIN} flags=(unconfined) { userns, - include if exists + include if exists } EOT sudo systemctl restart apparmor.service diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index a8c0e2e..167f924 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -90,8 +90,8 @@ dockerd-rootless-setuptool.sh install **As deploy — explicitly target the root daemon; the rootless installer may have switched the CLI context:** ```bash -POSTGRES_USER=$(grep '^POSTGRES_USER=' /opt/station-bot/.env.production | cut -d= -f2) -POSTGRES_DB=$(grep '^POSTGRES_DB=' /opt/station-bot/.env.production | cut -d= -f2) +POSTGRES_USER=$(grep '^POSTGRES_USER=' /opt/station-bot/.env.production | cut -d= -f2-) +POSTGRES_DB=$(grep '^POSTGRES_DB=' /opt/station-bot/.env.production | cut -d= -f2-) DOCKER_HOST=unix:///var/run/docker.sock docker exec station-bot-postgres pg_dump -U "${POSTGRES_USER}" "${POSTGRES_DB}" > /tmp/station_bot_backup.sql echo "Dump size: $(wc -c < /tmp/station_bot_backup.sql) bytes" ``` @@ -105,8 +105,7 @@ echo "Dump size: $(wc -c < /tmp/station_bot_backup.sql) bytes" cd /opt/station-bot docker compose -f docker-compose.prod.yml down -# Activate rootless in 
this session -export PATH=${HOME}/bin:${PATH} +# Activate rootless in this session (PATH unchanged — APT install uses /usr/bin) export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock # Enable and start rootless service @@ -122,11 +121,10 @@ docker info | grep -i rootless **As deploy:** ```bash -# Make DOCKER_HOST permanent +# Make DOCKER_HOST permanent (PATH unchanged — APT install uses /usr/bin) cat >> ~/.bashrc << 'RCEOF' # rootless docker -export PATH=${HOME}/bin:${PATH} export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock RCEOF diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index 63b891e..f221f5b 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -55,15 +55,15 @@ fi # cannot escalate to root or access other users' containers via Docker. loginctl enable-linger "${DEPLOY_USER}" -# Set DOCKER_HOST and PATH in the deploy user's shell so rootless Docker is -# used automatically on interactive/login SSH sessions. Non-interactive shells +# Set DOCKER_HOST in the deploy user's shell so rootless Docker is used +# automatically on interactive/login SSH sessions. Non-interactive shells # (cron, CI) must set DOCKER_HOST themselves — the deploy/backup scripts do this. +# PATH does not need ~/bin since docker-ce-rootless-extras installs to /usr/bin. BASHRC="${DEPLOY_HOME}/.bashrc" if ! grep -q 'rootless docker' "${BASHRC}" 2>/dev/null; then cat >> "${BASHRC}" << 'RCEOF' # rootless docker -export PATH=${HOME}/bin:${PATH} export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock RCEOF fi @@ -73,7 +73,10 @@ chown "${DEPLOY_USER}:${DEPLOY_USER}" "${BASHRC}" # requires an explicit profile to create user namespaces. 
APPARMOR_RESTRICT="/proc/sys/kernel/apparmor_restrict_unprivileged_userns" if [ -f "${APPARMOR_RESTRICT}" ] && [ "$(cat "${APPARMOR_RESTRICT}")" = "1" ]; then - PROFILE_SLUG="$(echo "${DEPLOY_HOME}/bin/rootlesskit" | sed 's|^/||; s|/|.|g')" + # Resolve the actual rootlesskit binary path — with docker-ce-rootless-extras + # installed via APT, it lives at /usr/bin/rootlesskit, not ~/bin/rootlesskit. + ROOTLESSKIT_BIN="$(command -v rootlesskit)" + PROFILE_SLUG="$(echo "${ROOTLESSKIT_BIN}" | sed 's|^/||; s|/|.|g')" ROOTLESSKIT_PROFILE="/etc/apparmor.d/${PROFILE_SLUG}" if [ ! -f "${ROOTLESSKIT_PROFILE}" ]; then cat > "${ROOTLESSKIT_PROFILE}" << AAEOF @@ -81,7 +84,7 @@ if [ -f "${APPARMOR_RESTRICT}" ] && [ "$(cat "${APPARMOR_RESTRICT}")" = "1" ]; t abi , include -${DEPLOY_HOME}/bin/rootlesskit flags=(unconfined) { +${ROOTLESSKIT_BIN} flags=(unconfined) { userns, # Site-specific additions and overrides. See local/README for details. @@ -99,12 +102,11 @@ fi if runuser -l "${DEPLOY_USER}" -c "systemctl --user is-active docker >/dev/null 2>&1"; then echo "Rootless Docker already active for ${DEPLOY_USER} — skipping install." else - if runuser -l "${DEPLOY_USER}" -c "[ -f \"\${HOME}/bin/dockerd\" ]"; then + if runuser -l "${DEPLOY_USER}" -c "[ -f \"\${HOME}/.config/systemd/user/docker.service\" ]"; then echo "Partial rootless install detected — cleaning up before retry." 
runuser -l "${DEPLOY_USER}" -c " systemctl --user stop docker 2>/dev/null || true - \"\${HOME}/bin/dockerd-rootless-setuptool.sh\" uninstall -f 2>/dev/null || true - rm -f \"\${HOME}/bin/dockerd\" + dockerd-rootless-setuptool.sh uninstall -f 2>/dev/null || true rm -rf \"\${HOME}/.local/share/docker\" " fi From 6dbcc1651e09637591b389e0c8d9a8fd47c64249 Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 22:53:48 -0400 Subject: [PATCH 14/19] fix: address PR 156 round-11 review feedback - vps-setup.md: remove PATH from the bootstrap-vps.sh description bullet; only DOCKER_HOST is written to .bashrc with the APT-based install - rootless-docker-migration.md: fix cut -d= -f2 to cut -d= -f2- in the post-mortem Issue 1 example snippet to match the hardened parsing used throughout this PR --- infra/docs/rootless-docker-migration.md | 4 ++-- infra/docs/vps-setup.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md index 3697b0a..fb483d9 100644 --- a/infra/docs/rootless-docker-migration.md +++ b/infra/docs/rootless-docker-migration.md @@ -107,8 +107,8 @@ No change to the deployment workflow. SSH in as deploy, run the usual docker com **Fix:** Extract only the needed variables directly: ```bash -POSTGRES_USER=$(grep '^POSTGRES_USER=' .env.production | cut -d= -f2) -POSTGRES_DB=$(grep '^POSTGRES_DB=' .env.production | cut -d= -f2) +POSTGRES_USER=$(grep '^POSTGRES_USER=' .env.production | cut -d= -f2-) +POSTGRES_DB=$(grep '^POSTGRES_DB=' .env.production | cut -d= -f2-) ``` **Lesson:** `set -a; source` assumes every non-comment line is a valid variable assignment. It's brittle against env files written for human readability. Either enforce strict `KEY=value` formatting in env files, or extract specific variables when sourcing them in scripts. 
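Reviewer sketch, not part of the patch: the `-f2` to `-f2-` change above matters whenever a value itself contains `=`. The snippet below is a minimal illustration under stated assumptions; `env_value`, the temp file, and the `DATABASE_URL` value are hypothetical names made up for the demo, while the `grep | cut -d= -f2- || true` idiom is the one the scripts in this series use.

```bash
#!/usr/bin/env bash
set -euo pipefail

# env_value FILE KEY: print the value of KEY from a KEY=value file.
# Hypothetical helper mirroring the grep | cut idiom used in the scripts.
env_value() {
  local file="$1" key="$2"
  # -f2- keeps everything after the first '=', so '=' inside a value survives.
  # '|| true' keeps a missing key from aborting the script under pipefail.
  grep "^${key}=" "${file}" | cut -d= -f2- || true
}

# Throwaway env file with made-up values for illustration.
demo_env="$(mktemp)"
trap 'rm -f "${demo_env}"' EXIT
printf '%s\n' \
  '# comment line' \
  'DATABASE_USER=station' \
  'DATABASE_URL=postgres://db/app?sslmode=require' > "${demo_env}"

env_value "${demo_env}" DATABASE_USER   # prints: station
env_value "${demo_env}" DATABASE_URL    # prints: postgres://db/app?sslmode=require

# With plain -f2 the URL is cut at the second '=': postgres://db/app?sslmode
grep '^DATABASE_URL=' "${demo_env}" | cut -d= -f2
```

A missing key yields an empty string rather than an error, which is why the scripts follow the extraction with `: "${VAR:?...}"` checks.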
diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index 167f924..65b3459 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -12,7 +12,7 @@ The deploy user's Docker daemon runs unprivileged inside a user namespace. There - Installs `uidmap`, `dbus-user-session`, and `docker-ce-rootless-extras` prerequisites - Enables linger so the deploy user's systemd session persists without an active login -- Sets `DOCKER_HOST` and `PATH` in `~deploy/.bashrc` +- Sets `DOCKER_HOST` in `~deploy/.bashrc` (no PATH change needed — APT install uses `/usr/bin`) - Installs rootless Docker via `dockerd-rootless-setuptool.sh install` (ships with `docker-ce-rootless-extras`, no remote script execution) - Enables and starts the `docker` systemd user service From 00e3743dfca820c6e53d5828a053c57736a3ed7c Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 23:10:14 -0400 Subject: [PATCH 15/19] fix: address PR 156 round-12 review feedback - rootless-docker-migration.md: remove PATH from the "What changed" bullet; only DOCKER_HOST is written to .bashrc with the APT-based install, ~/bin is not needed --- infra/docs/rootless-docker-migration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md index fb483d9..dbe5521 100644 --- a/infra/docs/rootless-docker-migration.md +++ b/infra/docs/rootless-docker-migration.md @@ -17,7 +17,7 @@ Rootless Docker runs the daemon entirely inside the deploy user's own namespace. 
## What changed - Rootless Docker daemon installed and running as the `deploy` user via `systemd --user` -- `DOCKER_HOST` and `PATH` written to `~deploy/.bashrc` so interactive sessions use the rootless socket automatically +- `DOCKER_HOST` written to `~deploy/.bashrc` so interactive sessions use the rootless socket automatically (no PATH change needed — APT install uses `/usr/bin`) - `deploy` removed from the `docker` group - All containers (postgres, discord-bot) migrated to the rootless daemon with data intact - Postgres data preserved via `pg_dump` / `psql` restore across daemons From 2f18b81d18bab726dcbbf0a0546fb65d6a5c7cc3 Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 23:20:10 -0400 Subject: [PATCH 16/19] fix: address PR 156 round-13 review feedback - bootstrap-vps.sh: set XDG_RUNTIME_DIR explicitly for all systemctl --user invocations via runuser; without it systemctl cannot reach the user's D-Bus/systemd instance in non-interactive contexts (common error: failed to connect to bus) --- infra/scripts/bootstrap-vps.sh | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/infra/scripts/bootstrap-vps.sh b/infra/scripts/bootstrap-vps.sh index f221f5b..e78a217 100755 --- a/infra/scripts/bootstrap-vps.sh +++ b/infra/scripts/bootstrap-vps.sh @@ -95,17 +95,22 @@ AAEOF fi fi +# XDG_RUNTIME_DIR must be set explicitly when invoking systemctl --user via +# runuser; without it systemctl cannot reach the user's D-Bus/systemd instance. +DEPLOY_UID="$(id -u "${DEPLOY_USER}")" +DEPLOY_XDG="XDG_RUNTIME_DIR=/run/user/${DEPLOY_UID}" + # Install rootless Docker as the deploy user. # If dockerd is already running rootless and healthy, skip the install. -# If a partial install is detected (binary present but daemon not healthy), +# If a partial install is detected (service file exists but daemon not healthy), # clean up before retrying so the installer does not get stuck. 
-if runuser -l "${DEPLOY_USER}" -c "systemctl --user is-active docker >/dev/null 2>&1"; then +if runuser -l "${DEPLOY_USER}" -c "${DEPLOY_XDG} systemctl --user is-active docker >/dev/null 2>&1"; then echo "Rootless Docker already active for ${DEPLOY_USER} — skipping install." else if runuser -l "${DEPLOY_USER}" -c "[ -f \"\${HOME}/.config/systemd/user/docker.service\" ]"; then echo "Partial rootless install detected — cleaning up before retry." runuser -l "${DEPLOY_USER}" -c " - systemctl --user stop docker 2>/dev/null || true + ${DEPLOY_XDG} systemctl --user stop docker 2>/dev/null || true dockerd-rootless-setuptool.sh uninstall -f 2>/dev/null || true rm -rf \"\${HOME}/.local/share/docker\" " @@ -115,7 +120,7 @@ else fi # Enable and start the rootless Docker service for the deploy user. -runuser -l "${DEPLOY_USER}" -c "systemctl --user enable docker && systemctl --user start docker" +runuser -l "${DEPLOY_USER}" -c "${DEPLOY_XDG} systemctl --user enable docker && ${DEPLOY_XDG} systemctl --user start docker" # Remove the deploy user from the docker group if they were added by a # previous bootstrap run (rootless Docker requires no group membership). 
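Reviewer sketch, not part of the patch: the enable/start line above repeats `${DEPLOY_XDG}` on both sides of `&&`. A `VAR=value command` prefix applies to that single command only, so one prefix would leave the second `systemctl` call without `XDG_RUNTIME_DIR`. The demo below assumes `env -u` and `printenv` from coreutils are available; `DEMO_XDG` and the UID `1001` are hypothetical stand-ins, and `sh -c` plays the role of `runuser -c`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# DEMO_XDG is a hypothetical stand-in for DEPLOY_XDG; 1001 is a made-up UID.
DEMO_XDG="XDG_RUNTIME_DIR=/run/user/1001"

# Prefix on the first command only. The second printenv finds the variable
# unset and fails, so the inner shell exits non-zero.
first_only="$(env -u XDG_RUNTIME_DIR sh -c \
  "${DEMO_XDG} printenv XDG_RUNTIME_DIR && printenv XDG_RUNTIME_DIR")" || true

# Prefix repeated on both sides of &&, as bootstrap-vps.sh does.
both="$(env -u XDG_RUNTIME_DIR sh -c \
  "${DEMO_XDG} printenv XDG_RUNTIME_DIR && ${DEMO_XDG} printenv XDG_RUNTIME_DIR")"

printf 'first only: %s line(s)\n' "$(printf '%s\n' "${first_only}" | grep -c .)"
printf 'both sides: %s line(s)\n' "$(printf '%s\n' "${both}" | grep -c .)"
```

The prefix works here because the whole string is parsed by the inner shell. Expanding `${DEMO_XDG}` directly in front of a command in the outer shell would not be treated as an assignment.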
From 0cf1a21d09f2ef0a0336cb32a102f68b1720088c Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 23:30:09 -0400 Subject: [PATCH 17/19] fix: address PR 156 round-14 review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - restore-db.sh: set BACKEND_STOPPED=1 before docker compose stop rather than after; if stop exits non-zero under set -e the EXIT trap fires immediately and would have seen flag=0, skipping the restart — setting it first ensures the cleanup handler always attempts to bring the backend back up on any failure at or after the stop step --- infra/scripts/restore-db.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/infra/scripts/restore-db.sh b/infra/scripts/restore-db.sh index 7dda235..91a2756 100755 --- a/infra/scripts/restore-db.sh +++ b/infra/scripts/restore-db.sh @@ -57,8 +57,8 @@ echo "${LOG_PREFIX} WARNING: if you need a clean replacement, drop and recreate echo "${LOG_PREFIX} Starting in 5 seconds. Press Ctrl+C to abort." 
sleep 5 -docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" stop backend BACKEND_STOPPED=1 +docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" stop backend gunzip -c "${LOCAL_FILE}" | docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" exec -T postgres \ psql -U "${DATABASE_USER}" -d "${DATABASE_NAME}" docker compose --env-file "${ENV_FILE}" -f "${COMPOSE_FILE}" start backend From 7266af3e97cb1d8df30791e50f1b49fd2ad9e95b Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 23:44:21 -0400 Subject: [PATCH 18/19] fix: address PR 156 round-15 review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - vps-setup.md: rewrite overview to accurately state that a leaked key gives SSH access and arbitrary container execution as the deploy user; the key hardening property is preventing root escalation via Docker, not restricting to deploy-related operations only - vps-setup.md: replace misleading "Survive deploy key compromise" table row with "Blast radius of leaked key: Full host (root) → Deploy user only" which accurately describes what rootless Docker actually achieves - rootless-docker-migration.md: tighten the Why section — a compromised key cannot escalate to root or access other users' containers via Docker, but blast radius is correctly scoped to the deploy user's namespace --- infra/docs/rootless-docker-migration.md | 2 +- infra/docs/vps-setup.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md index dbe5521..b59a7e0 100644 --- a/infra/docs/rootless-docker-migration.md +++ b/infra/docs/rootless-docker-migration.md @@ -10,7 +10,7 @@ The `docker` group is root-equivalent. Any process that can reach `/var/run/docker.sock` can mount the host filesystem, run privileged containers, and escalate to root. 
If the deploy SSH key were ever leaked, an attacker would have had full root access to the host. -Rootless Docker runs the daemon entirely inside the deploy user's own namespace. The socket lives at `/run/user//docker.sock` and is inaccessible to other non-root users. A compromised deploy key can only affect the deploy user's containers — it cannot escalate to root or access other users' containers via Docker. +Rootless Docker runs the daemon entirely inside the deploy user's own namespace. The socket lives at `/run/user//docker.sock` and is inaccessible to other non-root users. A compromised deploy key cannot escalate to root or access other users' containers via Docker — the blast radius is limited to the deploy user's own namespace and resources. --- diff --git a/infra/docs/vps-setup.md b/infra/docs/vps-setup.md index 65b3459..a1f81f3 100644 --- a/infra/docs/vps-setup.md +++ b/infra/docs/vps-setup.md @@ -2,7 +2,7 @@ ## Overview -The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker should only be able to run deploy-related Docker operations — nothing more. This is achieved with rootless Docker: the deploy user runs their own Docker daemon entirely within their user namespace, with no access to the root Docker socket and no docker group membership. A compromised key cannot escalate to root or access other users' containers via Docker — though the deploy user can still affect resources they own (files, CPU, memory). +The deploy SSH key lives in GitHub Secrets and is used on every deployment. If it were leaked, the attacker would have SSH access as the deploy user and could run arbitrary Docker containers within the deploy user's namespace. The key hardening property is that they cannot escalate to root or access other users' containers via Docker: the deploy user runs their own Docker daemon entirely within their user namespace, with no access to the root Docker socket and no docker group membership. 
## Approach: rootless Docker @@ -50,7 +50,7 @@ groups # should NOT include 'docker' | Access root Docker socket | ✓ (root-equivalent) | ✗ | | Escalate to root via Docker | ✓ | ✗ | | Affect other users' containers | ✓ | ✗ | -| Survive deploy key compromise | ✗ | ✓ | +| Blast radius of leaked key | Full host (root) | Deploy user only | ## Reproducing on a fresh VPS From 4c722ce2f55ff8730f57cff24958149af4891254 Mon Sep 17 00:00:00 2001 From: gitaddremote Date: Sun, 10 May 2026 23:57:13 -0400 Subject: [PATCH 19/19] fix: address PR 156 round-16 review feedback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - rootless-docker-migration.md Issue 4: replace hard-coded curl|sh install paths (/home/deploy/bin/dockerd-rootless-setuptool.sh, rm -f /home/deploy/bin/dockerd) with APT-based equivalents; dockerd-rootless-setuptool.sh is in /usr/bin and dockerd is not placed in ~/bin with the APT install — cleanup is now the setup tool uninstall + rm -rf ~/.local/share/docker, with a note for anyone cleaning up an old curl|sh install - rootless-docker-migration.md Prerequisites: add historical record callout to clarify the apt block reflects the original curl|sh migration and that the recommended method now includes docker-ce-rootless-extras --- infra/docs/rootless-docker-migration.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/infra/docs/rootless-docker-migration.md b/infra/docs/rootless-docker-migration.md index b59a7e0..04ee2b1 100644 --- a/infra/docs/rootless-docker-migration.md +++ b/infra/docs/rootless-docker-migration.md @@ -26,6 +26,8 @@ Rootless Docker runs the daemon entirely inside the deploy user's own namespace. ## Prerequisites installed +> **Historical record:** these are the packages installed during the original migration, which used `curl | sh`. 
The recommended install method now also includes `docker-ce-rootless-extras` — see the install method note in [Migration steps](#migration-steps-for-reference-on-future-vps). + ```bash apt install -y uidmap dbus-user-session loginctl enable-linger deploy @@ -143,15 +145,16 @@ curl -fsSL https://get.docker.com/rootless | FORCE_ROOTLESS_INSTALL=1 sh **What happened:** After the AppArmor fix, the installer detected the partial installation from the first failed attempt and refused to proceed. -**Fix:** Clean up the partial install before retrying: +**Fix:** Clean up the partial install before retrying. With the APT-based install (`docker-ce-rootless-extras`), `dockerd-rootless-setuptool.sh` is in `/usr/bin` and `dockerd` is not placed in `~/bin`: ```bash -systemctl --user stop docker -/home/deploy/bin/dockerd-rootless-setuptool.sh uninstall -f -rm -f /home/deploy/bin/dockerd -rm -rf /home/deploy/.local/share/docker +systemctl --user stop docker 2>/dev/null || true +dockerd-rootless-setuptool.sh uninstall -f 2>/dev/null || true +rm -rf ~/.local/share/docker ``` +> **Note:** The original migration used `curl | sh` which installed binaries to `~/bin/`. If cleaning up an old curl-based install, also run `rm -f ~/bin/dockerd ~/bin/dockerd-rootless-setuptool.sh`. + **Lesson:** The rootless Docker installer is not idempotent when a previous run failed partway through. Always clean up before retrying a failed install. ---