feat(cloud): device-code workstation enrollment#14

Merged
jirhiker merged 2 commits into main from claude/jolly-kirch-89a1f6 on May 9, 2026

@jirhiker jirhiker commented May 9, 2026

Summary

Adds an RFC 8628-style device-code grant for onboarding workstations to pychronAPI without requiring email access on the lab machine.

Flow:

  1. Tech clicks Start device-code enrollment in Pychron Cloud preferences.
  2. Pychron POSTs /api/v1/forgejo/device-codes → server returns a polling secret (device_code) + a short admin-typed code (user_code, e.g. ABCD-EFGH) + a verification URL.
  3. Pane shows the user_code and URL; tech reads them aloud to admin (or admin sees them on the workstation screen).
  4. Admin signs in on any browser-capable device, enters the code, picks lab + technician email + scopes, approves.
  5. Workstation's polling thread sees success → persists keypair + SSH config + OS-keyring token. Status flips to Enrolled as <lab>.

No email anywhere; works on air-gapped lab machines as long as the workstation can reach pychronAPI.
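
The flow above can be sketched as a small start, display, poll loop. This is a minimal illustration, not the code in this PR: the function `enroll`, the exception names, and the injected `start` / `poll` callables are hypothetical, and the `verification_uri` and `interval` field names are borrowed from RFC 8628 rather than confirmed for pychronAPI.

```python
import time


class DeviceCodePending(Exception):
    """Hypothetical: the admin has not approved or denied the code yet."""


class DeviceCodeDenied(Exception):
    """Hypothetical: the admin rejected the enrollment."""


class DeviceCodeExpired(Exception):
    """Hypothetical: the device_code lifetime elapsed before approval."""


class EnrollmentCancelled(Exception):
    """Hypothetical: the local user cancelled while polling."""


def enroll(start, poll, on_user_code, should_cancel=lambda: False, sleep=time.sleep):
    """Run start -> show code -> poll until a terminal state is reached.

    start()           returns a dict with device_code, user_code,
                      verification_uri and (optionally) interval.
    poll(device_code) returns the credential payload on success and raises
                      DeviceCodePending / DeviceCodeDenied / DeviceCodeExpired.
    """
    grant = start()
    # Only the short user_code and the URL are shown; device_code stays private.
    on_user_code(grant["user_code"], grant["verification_uri"])
    interval = grant.get("interval", 5)
    while True:
        if should_cancel():
            raise EnrollmentCancelled()
        try:
            return poll(grant["device_code"])
        except DeviceCodePending:
            # Still waiting for the admin; denied / expired propagate to the caller.
            sleep(interval)
```

Passing `start`, `poll`, and `sleep` in as callables keeps the loop free of network and timing dependencies, so it can be exercised with fakes.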

Files

  • pychron/cloud/api_client.py: start_device_code + poll_device_code with typed errors (CloudDeviceCodePending, CloudDeviceCodeDenied, CloudDeviceCodeExpired, CloudFingerprintRejected). Both endpoints are unauthenticated (the device_code is the polling credential); no Authorization header is sent. Plaintext device_code, user_code, and api_token are stripped from .raw before the result is exposed.
  • pychron/cloud/workstation_setup.py: WorkstationSetup.from_device_code classmethod orchestrates start → on_user_code callback → poll loop → persist registration + SSH config + keyring. should_cancel callback lets the UI break the loop mid-poll. Keyring failure raises KeyringWriteFailedError whose __str__ deliberately omits the token; .api_token and .lab_name attributes carry the still-in-memory plaintext for UI display.
  • pychron/cloud/tasks/preferences.py: new Enroll via Device Code group. Worker thread runs from_device_code and dispatches all UI updates back to the main thread via pyface.api.GUI.invoke_later so BasePreferencesHelper's persistence listeners stay single-threaded. Live _pending_user_code + _pending_verification_url fields are shown while polling. On keyring failure, the _recovery_token field surfaces the token so the tech can paste it into a password manager before closing the window.
  • test/cloud/test_api_client.py: TestStartDeviceCode (9) + TestPollDeviceCode (12) cover happy paths, status mapping, no Authorization header, secrets stripped from .raw, transport / non-JSON failures, empty-arg validation.
  • test/cloud/test_device_code_setup.py: orchestrator tests covering pending → success (persists all artifacts + keyring); denied / expired (propagate without persisting); should_cancel → DeviceEnrollmentCancelled; keyring-fail → KeyringWriteFailedError (token on attrs, NOT in str); empty api_base_url (aborts before any I/O).
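
As an illustration of the keyring-failure behaviour described above, here is one way an exception can carry a token on attributes while keeping it out of its string form. This is a sketch under assumptions, not the class in this PR: the constructor signature and the message text are invented for the example.

```python
class KeyringWriteFailedError(Exception):
    """Sketch: raised when the OS keyring rejects the token write.

    The token is stored on an attribute so a UI can display it once, but it
    is never passed to Exception.__init__, so it cannot leak through
    str(), repr(), or a traceback's message line.
    """

    def __init__(self, lab_name, api_token):
        # Only the non-secret message goes into self.args.
        super().__init__(
            "could not store the API token for lab %r in the OS keyring" % lab_name
        )
        self.lab_name = lab_name
        self.api_token = api_token

    def __str__(self):
        # Explicit for clarity: the string form is the message only.
        return self.args[0]


# Usage: logging the error is safe, while the UI reads the attribute directly.
try:
    raise KeyringWriteFailedError("NMGRL", "example-token")
except KeyringWriteFailedError as exc:
    log_line = "enrollment incomplete: %s" % exc
    recovery_token = exc.api_token
```

Keeping the secret out of `args` matters because the default `repr()` of an exception prints every positional argument.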

Server-side prereq

The companion server endpoints (/api/v1/forgejo/device-codes start/approve/deny/poll/list, the ForgejoDeviceCode table + alembic migration 0004, the shared mint_workstation_credential helper) live in pychronAPI and need to land first. This PR is the client-only half — without the server endpoints it'll get 404s. The server work is a separate PR in the pychronAPI repo.

Test plan

  • Full test/cloud/ suite: 130 / 130 green.
  • Local manual smoke: start enrollment → see code in pane → approve via test admin endpoint on staging pychronAPI → verify keypair, registration.json, ssh config, keyring all populated.
  • Manual: cancel mid-poll → status shows "Enrollment cancelled".
  • Manual: simulate keyring failure (revoke macOS Keychain access) → recovery token field surfaces with the plaintext for copy.

🤖 Generated with Claude Code

jirhiker and others added 2 commits May 9, 2026 16:31
Adds an RFC 8628-style device-code grant for onboarding workstations to
pychronAPI without requiring email access on the lab machine. The
technician clicks a button; pychron displays a short user_code + a
verification URL. The admin signs in on any browser-capable device,
types the code, picks lab + scopes + technician email, and approves.
The workstation's polling thread sees success and persists keypair +
SSH config + OS-keyring token.

Client side:

- pychron/cloud/api_client.py — start_device_code and poll_device_code
  with typed errors (CloudDeviceCodePending, CloudDeviceCodeDenied,
  CloudDeviceCodeExpired, CloudFingerprintRejected). Both endpoints
  unauthenticated; no Authorization header sent. Plaintext
  device_code, user_code, and api_token are stripped from .raw before
  exposure to keep bearer secrets out of any caller's debug logs.

- pychron/cloud/workstation_setup.py — WorkstationSetup.from_device_code
  classmethod orchestrates start → on_user_code callback → poll loop
  → persist registration + SSH config + keyring. should_cancel
  callback lets the UI cancel mid-poll. Keyring failure raises
  KeyringWriteFailedError whose __str__ deliberately omits the token;
  .api_token and .lab_name attributes carry it for UI display.

- pychron/cloud/tasks/preferences.py — new "Enroll via Device Code"
  group in CloudPreferences. Worker thread runs from_device_code,
  dispatches all UI updates back to the main thread via
  pyface.api.GUI.invoke_later so BasePreferencesHelper's persistence
  listeners stay single-threaded. Live _pending_user_code +
  _pending_verification_url fields shown while polling. On keyring
  failure, _recovery_token field surfaces the still-in-memory token
  so the technician can paste it into a password manager.

- test/cloud/test_api_client.py — TestStartDeviceCode (9) +
  TestPollDeviceCode (12) cover happy paths, status mapping, missing
  Authorization header, secrets stripped from raw, transport / non-JSON
  failures, empty-arg validation.

- test/cloud/test_device_code_setup.py — orchestrator tests:
  pending-then-success persists all artifacts and writes keyring;
  denied/expired propagate without persisting; should_cancel raises
  DeviceEnrollmentCancelled; keyring-failure raises
  KeyringWriteFailedError with token on attributes (NOT in str);
  empty api_base_url aborts before any I/O.
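
A minimal sketch of the secret-stripping idea mentioned for api_client.py, assuming the server response is a flat JSON object. The helper name `strip_secrets` and the constant `SECRET_KEYS` are hypothetical; only the three key names come from the description above.

```python
SECRET_KEYS = frozenset({"device_code", "user_code", "api_token"})


def strip_secrets(payload):
    """Return a copy of a response dict with bearer secrets removed.

    Assumption: secrets only appear as top-level keys. The input dict is
    left untouched so the caller can still read the secrets it needs.
    """
    return {key: value for key, value in payload.items() if key not in SECRET_KEYS}


# Usage: keep the secrets in typed fields, expose only the stripped copy as raw.
response = {
    "device_code": "polling-secret",
    "user_code": "ABCD-EFGH",
    "verification_uri": "https://example.invalid/device",
}
raw = strip_secrets(response)
```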

The companion server-side endpoints (/api/v1/forgejo/device-codes
start/approve/deny/poll/list, the ForgejoDeviceCode table + migration,
the shared mint_workstation_credential helper) live in pychronAPI and
need to land before this client is functional in production.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI black version (26.3.1) collapses the double blank line between the
last import group and the first module-level statement to a single
blank line. Local pre-commit hook had an older black, didn't catch it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jirhiker jirhiker merged commit d60257a into main May 9, 2026
4 checks passed
jirhiker added a commit that referenced this pull request May 10, 2026
Address path-traversal finding from /security-review on PR #14.

`make_qr_for_device_code(host_slug=...)` previously interpolated the
slug directly into the output filename (`device_<slug>.png`). The
caller in `tasks/preferences.py` passes `self.lab_name`, which can
hold an attacker-influenced value (server-issued after a prior
enrollment, or a hand-edited preference) carrying `..` / `/` / null
bytes. The resulting file would land outside the scoped
`~/.pychron/qr/` directory.

Two-layer fix:

1. `_sanitize_slug` whitelists `[A-Za-z0-9_-]` at function entry; any
   other byte becomes `_`. This collapses traversal payloads to
   inert filenames before they reach the path layer.

2. After path construction, `os.path.realpath(out_path)` is asserted
   to live under `os.path.realpath(qr_dir())`. A defense-in-depth
   guard so any future slug-handling regression cannot escape the
   scoped directory.
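
The two layers can be sketched as follows. This is an illustration of the technique, not the committed code: `sanitize_slug` and `checked_qr_path` are hypothetical names, and the QR directory is passed in as a parameter instead of calling `qr_dir()`.

```python
import os
import re

# Anything outside the whitelist [A-Za-z0-9_-] is replaced with "_".
_UNSAFE = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_slug(slug):
    """Layer 1: collapse traversal characters to inert underscores."""
    return _UNSAFE.sub("_", slug)


def checked_qr_path(qr_directory, host_slug):
    """Build the output path, then verify it resolves inside qr_directory."""
    out_path = os.path.join(qr_directory, "device_%s.png" % sanitize_slug(host_slug))
    root = os.path.realpath(qr_directory)
    resolved = os.path.realpath(out_path)
    # Layer 2: defense in depth; holds even if layer 1 regresses later.
    if os.path.commonpath([root, resolved]) != root:
        raise ValueError("QR output path escapes the QR directory")
    return out_path
```

With this sketch, `sanitize_slug("lab.name")` gives `lab_name` and `sanitize_slug("a/b/c")` gives `a_b_c`, while `NMGRL` and `lab-2024_NM` pass through unchanged.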

Tests cover the traversal payloads (`../../etc/passwd`, `..`,
`../../tmp/owned`, `a/b/c`, `lab.name`, `foo\x00bar`) plus the
preservation of normal slugs (`NMGRL`, `lab-2024_NM`).

138/138 cloud tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>