97 changes: 97 additions & 0 deletions infra/README.md
@@ -0,0 +1,97 @@
# infra/

Scripts for renting and provisioning a Scaleway Elastic Metal server for
benchmark / feature-testing work.

| Script | Runs on | Purpose |
|----------------------------|---------|--------------------------------------------------------------------------------------------------------|
| `rent_baremetal.sh <name>` | local | Creates the server (hourly billing, Debian, fr-par-2) and hands off to `provision_server.sh`. |
| `provision_server.sh <ip>` | local | Waits for sshd, runs `provision.sh` on the server over SSH. Re-runnable. |
| `provision.sh` | remote | Installs toolchain, creates `admin`/`app` users, clones `lambda_vm`, hardens sshd. |

## Prerequisites

1. Install `scw` and `jq`:
   ```bash
   brew install scw jq # macOS
   ```

2. Create the `vm` scw profile (the script refuses any other profile name):
   ```bash
   scw init --profile vm
   ```
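
The profile restriction can be pictured as a small guard near the top of the script. This is only a sketch: the function name `require_profile` is hypothetical, and it assumes the active profile is selected through the `SCW_PROFILE` environment variable, which may not be how `rent_baremetal.sh` actually decides.

```bash
# Hypothetical guard: succeed only when the active scw profile matches.
# Assumption: the profile is chosen via SCW_PROFILE; the real script may differ.
require_profile() {
    local expected="$1"
    local actual="${SCW_PROFILE:-}"
    if [ "$actual" != "$expected" ]; then
        echo "error: scw profile must be '$expected' (got '${actual:-unset}')" >&2
        return 1
    fi
}
```

Calling `require_profile vm` before any `scw` command would stop the run early when a different profile is active.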

## Rent + provision a new server

```bash
infra/rent_baremetal.sh test-1
```

This runs end to end in one command: it creates the server, waits for
both `status=ready` and `install.status=completed`, then provisions it
(apt packages, `admin`/`app` users, Rust toolchain, gh CLI, Claude Code,
lambda-vm sysroot, repo clone, ufw firewall, ssh hardening).
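
The wait on `status=ready` and `install.status=completed` amounts to a poll loop with a timeout. The sketch below shows one possible shape, not the actual code in `rent_baremetal.sh`. The function name `wait_for_ready` and the probe convention (a caller-supplied command that prints `<status> <install.status>`) are assumptions made for illustration.

```bash
# Sketch: poll a status probe until both conditions hold or the timeout passes.
# "$@" is a caller-supplied probe (hypothetical) printing "<status> <install.status>".
wait_for_ready() {
    local timeout="$1" interval="$2"
    shift 2
    local waited=0 state
    while :; do
        state="$("$@" 2>/dev/null || true)"
        if [ "$state" = "ready completed" ]; then
            return 0
        fi
        # Only sleep time is counted, so the real elapsed time is slightly longer.
        if [ "$waited" -ge "$timeout" ]; then
            echo "error: not ready after ${waited}s (last state: ${state:-unknown})" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

With a probe function of your own, a call such as `wait_for_ready "${READY_TIMEOUT:-1800}" 10 my_probe` would give up after roughly 30 minutes; `my_probe` is a placeholder name.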

Use a unique name (Scaleway rejects duplicates):

```bash
infra/rent_baremetal.sh <server_name>
```

After it finishes, log in as:

```bash
ssh admin@<ip> # passwordless sudo
ssh app@<ip> # workload user, no sudo, has ~/lambda_vm cloned
```

Root SSH is disabled at the end of provisioning.
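
To confirm the hardening landed, one option is to read the directives back from the drop-in file that `provision.sh` writes. The helper names below are hypothetical, and the check only inspects that one file; `sshd -T` on the server prints the effective configuration if a fuller check is needed.

```bash
# Sketch: print the value of one directive from an sshd config file.
# Keyword matching is case-insensitive; the first occurrence wins, as in sshd.
sshd_directive() {
    local file="$1" key="$2"
    awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file"
}

# Succeeds only when root login and password logins are both disabled.
root_ssh_disabled() {
    local file="${1:-/etc/ssh/sshd_config.d/99-hardening.conf}"
    [ "$(sshd_directive "$file" PermitRootLogin)" = "no" ] &&
        [ "$(sshd_directive "$file" PasswordAuthentication)" = "no" ]
}
```

Run as `admin` on the server, `root_ssh_disabled && echo hardened` reports whether the drop-in carries the expected values.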

## Re-provision an existing server

If `provision.sh` failed partway, or you want to re-apply changes, point
`provision_server.sh` at the IP directly. `provision.sh` is idempotent, so
re-running it is safe.

```bash
# Before hardening (root still works):
infra/provision_server.sh <ip>

# After hardening (root SSH is dead, use admin):
SSH_USER=admin infra/provision_server.sh <ip>
```
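
Re-runs are safe because each step checks before it writes. A minimal sketch of that pattern, modelled on the `grep -qxF` checks in `provision.sh`, is below; the helper name `append_once` is hypothetical and does not exist in the scripts.

```bash
# Sketch of a check-before-write step: append a line only if it is missing.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Calling `append_once` twice with the same arguments leaves exactly one copy of the line in the file.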

The wrapper switches to `sudo bash -s` automatically when `SSH_USER` isn't
root.
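
The selection logic is small enough to show in full. The function wrapper `remote_cmd_for` is added here for illustration; `provision_server.sh` does the same thing inline with an `if`.

```bash
# Mirrors the wrapper's choice of remote command: root runs bash directly,
# any other user goes through sudo.
remote_cmd_for() {
    if [ "$1" = "root" ]; then
        echo "bash -s"
    else
        echo "sudo bash -s"
    fi
}
```

For `SSH_USER=admin` this yields `sudo bash -s`, which works without a prompt because `admin` has passwordless sudo.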

## Configuration

Everything has a working default; override via env var only when needed.

| Var | Default | Used by | Notes |
|---|---|---|---|
| `SCW_TYPE` | `EM-I320E-NVME` | `rent_baremetal.sh` | Scaleway commercial type. Must have an `hourly` offer in `$SCW_ZONE` or the script refuses to run. |
| `SCW_ZONE` | `fr-par-2` | `rent_baremetal.sh` | One of `fr-par-1`, `fr-par-2`, `nl-ams-1`, `nl-ams-2`, `pl-waw-2`, `pl-waw-3`. |
| `SCW_OS_ID` | `83640d93-...` (Debian 12) | `rent_baremetal.sh` | Must have `cloud_init_supported: true`. |
| `SCW_PROJECT_ID` | `946cfb34-...` (lambda_vm) | `rent_baremetal.sh` | Determines which scw IAM SSH keys get installed. |
| `READY_TIMEOUT` | `1800` (s) | `rent_baremetal.sh` | How long to wait for `status=ready && install.status=completed`. |
| `PROVISION_FILE` | `<script_dir>/provision.sh` | both wrappers | Path to the remote provisioning script. |
| `SSH_USER` | `root` | `provision_server.sh` | Switch to `admin` for re-runs after sshd hardening. |
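
The defaults above map naturally onto the shell `${VAR:-default}` expansion, which `provision_server.sh` already uses for `SSH_USER` and `PROVISION_FILE`. The sketch below prints the effective values for a few of the variables. The function name `effective_config` is hypothetical, and whether `rent_baremetal.sh` resolves its defaults the same way is an assumption.

```bash
# Sketch of the ${VAR:-default} pattern; defaults copied from the table above.
effective_config() {
    echo "SCW_TYPE=${SCW_TYPE:-EM-I320E-NVME}"
    echo "SCW_ZONE=${SCW_ZONE:-fr-par-2}"
    echo "READY_TIMEOUT=${READY_TIMEOUT:-1800}"
    echo "SSH_USER=${SSH_USER:-root}"
}
```

An override is then just an environment assignment on the command line, for example `SCW_ZONE=nl-ams-1 infra/rent_baremetal.sh test-2`.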

### `SCW_TYPE` options

| Type | CPU | RAM | Disk | Price (€/h) |
|---|---|---|---|---|
| `EM-I220E-NVME` | AMD EPYC 8124P (16c/32t @ 2.5 GHz) | 128 GB | 2× 960 GB NVMe | 0.548 |
| `EM-I320E-NVME` | AMD EPYC 8224P (24c/48t @ 2.5 GHz) | 192 GB | 2× 1.92 TB NVMe | 0.822 (default) |
| `EM-I420E-NVME` | AMD EPYC 8324P (32c/64t @ 2.6 GHz) | 256 GB | 2× 1.92 TB NVMe | 1.096 |
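
Since billing is hourly, a rough cost estimate is the hourly price times the number of hours. The helpers below are hypothetical and use the prices from the table above, which may be out of date; they also ignore taxes and any rounding rules Scaleway applies.

```bash
# Sketch: look up the hourly price (EUR) for a commercial type from the table above.
hourly_price() {
    case "$1" in
        EM-I220E-NVME) echo 0.548 ;;
        EM-I320E-NVME) echo 0.822 ;;
        EM-I420E-NVME) echo 1.096 ;;
        *) echo "error: unknown type: $1" >&2; return 1 ;;
    esac
}

# Print the estimated cost in EUR for <type> over <hours>, to two decimals.
estimate_cost() {
    local price
    price="$(hourly_price "$1")" || return 1
    awk -v p="$price" -v h="$2" 'BEGIN { printf "%.2f\n", p * h }'
}
```

For example, `estimate_cost EM-I320E-NVME 24` prints `19.73`, the estimate for keeping the default server for one day.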

### `SCW_ZONE` options

| Zone | Location |
|---|---|
| `fr-par-1` | Paris, France |
| `fr-par-2` | Paris, France (default) |
| `nl-ams-1` | Amsterdam, Netherlands |
| `nl-ams-2` | Amsterdam, Netherlands |
| `pl-waw-2` | Warsaw, Poland |
| `pl-waw-3` | Warsaw, Poland |
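
A zone check against the list above can be written as a single `case` statement. The function name `is_supported_zone` is hypothetical; the actual validation in `rent_baremetal.sh` is not shown here and may work differently.

```bash
# Sketch: succeed only for zones listed in the table above.
is_supported_zone() {
    case "$1" in
        fr-par-1|fr-par-2|nl-ams-1|nl-ams-2|pl-waw-2|pl-waw-3) return 0 ;;
        *) return 1 ;;
    esac
}
```

A script could call `is_supported_zone "${SCW_ZONE:-fr-par-2}"` and exit with an error before making any API request.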
181 changes: 181 additions & 0 deletions infra/provision.sh
@@ -0,0 +1,181 @@
#!/bin/bash
# Provision a freshly rented Scaleway Elastic Metal Debian server.
# Invoked remotely from infra/provision_server.sh as:
# ssh root@<ip> bash -s < infra/provision.sh
#
# Idempotent — safe to re-run.

set -euo pipefail

log() { printf '\n=== %s ===\n' "$*"; }

# --- 1. apt update + upgrade -------------------------------------------------
log "apt update + upgrade"
export DEBIAN_FRONTEND=noninteractive
APT_OPTS=(-y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold)

# Scaleway baremetal Debian ships grub-cloud-amd64; its postinst (fired as a
# trigger by initramfs-tools / shim-signed / kernel upgrades) runs grub-install
# against an ext2 root and fails ("will not proceed with blocklists"). The
# package isn't load-bearing on UEFI baremetal — purge it before any upgrade.
apt-get purge -y grub-cloud-amd64 2>/dev/null || true

apt-get update -y
apt-get upgrade "${APT_OPTS[@]}"

# --- 2. apt packages ---------------------------------------------------------
log "apt install base packages + clang/lld/llvm + xz-utils"
apt-get install "${APT_OPTS[@]}" \
    ca-certificates curl wget gnupg vim git zip unzip openssl libssl-dev jq \
    build-essential rsyslog htop rsync pkg-config locales ufw \
    clang lld llvm xz-utils

# --- 3. users: admin (sudo) + app (no sudo) ----------------------------------
log "users: admin (sudo) + app (no sudo)"
for u in admin app; do
    if ! id "$u" >/dev/null 2>&1; then
        useradd -m -s /bin/bash "$u"
    fi
done
echo 'admin ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/90-admin
chmod 0440 /etc/sudoers.d/90-admin

# --- 4. authorized_keys for admin and app ------------------------------------
log "authorized_keys: propagate root's keys + append hardcoded team keys"
if [ ! -s /root/.ssh/authorized_keys ]; then
    echo "ERROR: /root/.ssh/authorized_keys missing or empty; refusing to harden sshd." >&2
    exit 1
fi
TEAM_KEYS=(
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFzvQKhE/xqRxHbit/dZNej7T5eVLmF8CAGL7to6o3QY joaquin@mail.com"
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA2GAeixuqP4XwujuSK9KDgdmyglGzlQQsXztnve+bra gabriel@mail.com"
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKQnPPUb4gzmsmjDP98mNKXbpHrp9bIIL7QiRjyWEG6f julian@mail.com"
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBzniAUYGJXguBjfz2+uGUUC7XLVmk58FhCsEBMx2r5k mauro@mail.com"
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJ6mrcWIyU+/LrNZLivNIOYr6ld/CXefoq1hyXLsHDfV it"
)
for u in admin app; do
    install -d -m 0700 -o "$u" -g "$u" "/home/$u/.ssh"
    install -m 0600 -o "$u" -g "$u" /root/.ssh/authorized_keys "/home/$u/.ssh/authorized_keys"
    AUTH_FILE="/home/$u/.ssh/authorized_keys"
    if [ -n "$(tail -c 1 "$AUTH_FILE")" ]; then
        printf '\n' >> "$AUTH_FILE"
    fi
    for key in "${TEAM_KEYS[@]}"; do
        if ! grep -qxF "$key" "$AUTH_FILE"; then
            printf '%s\n' "$key" >> "$AUTH_FILE"
        fi
    done
    chown "$u:$u" "$AUTH_FILE"
done

# --- 5. GitHub CLI (gh) -----------------------------------------------------
if ! command -v gh >/dev/null 2>&1; then
    log "installing gh (GitHub CLI)"
    mkdir -p -m 755 /etc/apt/keyrings
    out=$(mktemp)
    wget -nv -O "$out" https://cli.github.com/packages/githubcli-archive-keyring.gpg
    cat "$out" > /etc/apt/keyrings/githubcli-archive-keyring.gpg
    rm -f "$out"  # remove the temporary download
    chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg
    mkdir -p -m 755 /etc/apt/sources.list.d
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \
        > /etc/apt/sources.list.d/github-cli.list
    apt-get update -y
    apt-get install "${APT_OPTS[@]}" gh
fi

# --- 6. Rust toolchain for app (1.94.0 default + nightly-2026-02-01 + src) ---
log "Rust 1.94.0 + nightly-2026-02-01 (rust-src) for app"
sudo -u app -H bash -se <<'APP_RUST'
set -euo pipefail
# Put ~/.cargo/bin on PATH before the check so re-runs detect an existing rustup.
export PATH="$HOME/.cargo/bin:$PATH"
if ! command -v rustup >/dev/null 2>&1; then
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \
        | sh -s -- -y --default-toolchain 1.94.0 --profile default
fi
grep -q 'cargo/env' "$HOME/.bashrc" 2>/dev/null \
    || echo '. "$HOME/.cargo/env"' >> "$HOME/.bashrc"
rustup toolchain install nightly-2026-02-01 --profile minimal --component rust-src
rustup component add rust-analyzer
APP_RUST

# --- 7. Claude Code for app -------------------------------------------------
log "Claude Code for app"
sudo -u app -H bash -se <<'APP_CLAUDE'
set -euo pipefail
export PATH="$HOME/.local/bin:$PATH"
if ! command -v claude >/dev/null 2>&1; then
    curl -fsSL https://claude.ai/install.sh | bash
fi
PATH_LINE='export PATH="$HOME/.local/bin:$PATH"'
grep -qxF "$PATH_LINE" "$HOME/.bashrc" 2>/dev/null \
    || printf '%s\n' "$PATH_LINE" >> "$HOME/.bashrc"
APP_CLAUDE

# --- 8. lambda-vm sysroot (rv64im) ------------------------------------------
SYSROOT_DIR=/opt/lambda-vm-sysroot
SYSROOT_URL=https://lambda.alignedlayer.com/lambda-vm-sysroot-rv64im.tar.gz
if [ ! -d "$SYSROOT_DIR" ]; then
    log "downloading sysroot to $SYSROOT_DIR"
    # -f: fail on HTTP errors instead of saving an error page as the tarball.
    curl -fL "$SYSROOT_URL" -o /tmp/sysroot.tar.gz
    mkdir -p /opt
    tar -xzf /tmp/sysroot.tar.gz -C /opt
    rm /tmp/sysroot.tar.gz
fi

# --- 9. Clone lambda_vm (as app, public repo over HTTPS) ---------------------
REPO_DIR=/home/app/lambda_vm
REPO_URL=https://github.com/yetanotherco/lambda_vm.git
if [ ! -d "$REPO_DIR/.git" ]; then
    log "cloning lambda_vm to $REPO_DIR (as app)"
    sudo -u app -H git clone "$REPO_URL" "$REPO_DIR"
fi

# --- 10. ethrex test fixture ------------------------------------------------
ETHREX_FILE=/home/app/lambda_vm/executor/tests/ethrex_hoodi.bin
ETHREX_URL=https://lambda.alignedlayer.com/ethrex_hoodi.bin
if [ -d /home/app/lambda_vm/executor/tests ] && [ ! -f "$ETHREX_FILE" ]; then
    log "downloading ethrex_hoodi.bin"
    # -f: fail on HTTP errors instead of writing an error page to the fixture path.
    sudo -u app -H curl -fL "$ETHREX_URL" -o "$ETHREX_FILE"
fi

# --- 11. ufw firewall (default deny in, allow out, only ssh in) -------------
log "ufw: default deny in / allow out, allow ssh (22/tcp) only"
ufw --force reset >/dev/null
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp
ufw --force enable

# --- 12. /etc/environment + locale ------------------------------------------
log "writing /etc/environment"
cat > /etc/environment <<'EOF'
LANG=en_US.UTF-8
LC_ALL=C
LANGUAGE=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
EOF
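# Debian's locale-gen reads /etc/locale.gen rather than taking a locale
# argument, so enable the entry there first.
sed -i 's/^# *\(en_US\.UTF-8 UTF-8\)/\1/' /etc/locale.gen
locale-gen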

# --- 13. sshd hardening (last; reload won't drop existing session) ----------
log "writing /etc/ssh/sshd_config.d/99-hardening.conf"
cat > /etc/ssh/sshd_config.d/99-hardening.conf <<'EOF'
PermitRootLogin no
PasswordAuthentication no
AllowAgentForwarding no
AllowTcpForwarding no
PubkeyAuthentication yes
MaxAuthTries 5
LoginGraceTime 30
ClientAliveInterval 300
ClientAliveCountMax 2
X11Forwarding no
PermitEmptyPasswords no
PermitUserEnvironment no
LogLevel VERBOSE
EOF
chmod 0644 /etc/ssh/sshd_config.d/99-hardening.conf
sshd -t
systemctl reload ssh

log "Done. Log in as admin@ (sudo) or app@ (no sudo). Root SSH is disabled."
74 changes: 74 additions & 0 deletions infra/provision_server.sh
@@ -0,0 +1,74 @@
#!/bin/bash
# Run infra/provision.sh on a remote Scaleway baremetal server over SSH.
# Safe to run standalone after rent_baremetal.sh, or to re-provision an
# existing server.
#
# Usage: infra/provision_server.sh <ip>
#
# Env var overrides (all optional):
#   SSH_USER        default: root
#                   First-run servers accept root SSH; once provision.sh
#                   has hardened sshd, re-run as: SSH_USER=admin ...
#   PROVISION_FILE  default: <script-dir>/provision.sh
#
# SSH wait is indefinite — Ctrl+C to abort.

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BOLD='\033[1m'
NC='\033[0m'

SSH_USER="${SSH_USER:-root}"
PROVISION_FILE="${PROVISION_FILE:-$SCRIPT_DIR/provision.sh}"

err() { echo -e "${RED}error:${NC} $*" >&2; }
info() { echo -e "${BOLD}$*${NC}"; }
ok() { echo -e "${GREEN}$*${NC}"; }

if [ $# -lt 1 ] || [ -z "${1:-}" ]; then
    err "missing <ip>"
    echo "Usage: $0 <ip>" >&2
    exit 2
fi
IP="$1"

if [ ! -r "$PROVISION_FILE" ]; then
    err "provision script not found or unreadable: $PROVISION_FILE"
    exit 1
fi
if ! command -v ssh >/dev/null 2>&1; then
    err "ssh not found on PATH."
    exit 1
fi

SSH_OPTS=(-o StrictHostKeyChecking=accept-new -o ConnectTimeout=10 -o BatchMode=yes)

info "Waiting for sshd on $SSH_USER@$IP (indefinite — Ctrl+C to abort)..."
attempt=1
while ! ssh "${SSH_OPTS[@]}" "$SSH_USER@$IP" true 2>/dev/null; do
    if [ $((attempt % 6)) -eq 0 ]; then
        echo -e " ${YELLOW}still waiting (attempt $attempt, ~$((attempt * 10))s elapsed)${NC}"
    fi
    attempt=$((attempt + 1))
    sleep 10
done
ok "sshd reachable on $SSH_USER@$IP (attempt $attempt)"

if [ "$SSH_USER" = "root" ]; then
    REMOTE_CMD="bash -s"
else
    REMOTE_CMD="sudo bash -s"
fi

info "Running $PROVISION_FILE on $SSH_USER@$IP..."
ssh "${SSH_OPTS[@]}" "$SSH_USER@$IP" "$REMOTE_CMD" < "$PROVISION_FILE"

echo
ok "Provisioning complete."
echo " ssh admin@$IP # sudo"
echo " ssh app@$IP # no sudo"