feat(skills): track upstream for user-imported GitHub skills by pjdoland · Pull Request #322 · plmbr/notebook-intelligence

pjdoland · 2026-05-19T02:28:54Z

Summary

User-imported GitHub skills can now opt into tracking upstream. The Import-from-GitHub dialog has a new Track upstream checkbox; ticking it stamps the installed skill with a tracks_upstream frontmatter flag and lights up a per-skill Sync button (and a panel-level Sync tracking skills batch action).

Sync re-fetches the bundle from the recorded source URL and replaces the on-disk content. Distinct from the existing managed-skills reconciler: tracking is opt-in per skill, never auto-scheduled, never auto-removed. The bundle on disk is preserved across every failure path (probe failure, tarball fetch failure, upstream repo deletion); the user clicked Sync, they're shown the error, and the existing bundle stays put.

Solution highlights

Schema (skillset.py): two new optional frontmatter keys, tracks_upstream and tracking_ref. Default falsy and omitted from serialized SKILL.md when unset, so existing user-imported skills are unchanged on disk.
SkillManager.sync_tracking_skill(scope, name): probes the commits API for the latest SHA at the recorded source, short-circuits when the SHA matches tracking_ref (no tarball fetch, no rmtree). When the SHA differs, stages the new bundle, atomically replaces, stamps the new ref. Probe failure raises rather than silently re-downloading. Tarball-fetch failure preserves the existing bundle on disk (rmtree only happens after staging succeeds).
HTTP: POST /skills/{scope}/{name}/sync and POST /skills/sync-all-tracking route through run_in_executor. The batch endpoint runs per-skill syncs through an asyncio.Semaphore(3) so N skills don't take N * 15s in the worst case. Both gated by allow_github_skill_import (same network egress, same trust boundary). PUT /skills/{scope}/{name} accepts a tracks_upstream field for toggling on existing skills; that toggle is also gated by allow_github_skill_import so an admin's kill switch can't leak.
Shared SHA helper (skill_github_import.py): new resolve_desired_sha(ref_info, token) consolidates the "if 40-char SHA short-circuit else probe" dance that the managed reconciler and the new sync path both run.
Frontend: a "Tracking" badge (distinct from "Managed" by color and tooltip), a per-row sync button with disabled-state styling, a panel toolbar batch button, and an editor toggle for opting in/out post-import. The toggle flips optimistically and rolls back on PUT failure so the checkbox always reflects server state.

Mutual exclusion

A skill cannot be both managed and tracking. The manager refuses to enable tracking on a managed skill (ValueError → 400 at the HTTP layer). The two metadata pairs (managed_source/managed_ref vs tracks_upstream/tracking_ref) are independent on the wire but enforced exclusive at the manager.

Never-delete invariants

The "never delete" contract is enforced by the call ordering:

rmtree(skill.root_path) runs only after stage_skill_from_github succeeds (which is the only step that can throw on a network or 404 error).
A probe failure raises before rmtree is even reached.
The reconciler is not extended to touch tracking skills; only the manual sync action can mutate them.

Testing

tests/test_skill_manager.py: 12 new cases in TestTrackUpstream (import opt-in, toggle on/off via update_skill, sync happy path, sync skips on matching SHA, sync raises on probe failure, sync rejects non-tracking/managed/missing skills, sync preserves bundle on staging failure, list_tracking_skills filter) plus 3 cases in TestManagedFrontmatter for the schema fields.
tests/test_skills_handlers.py: 9 new cases (SkillSyncHandler happy path, unchanged short-circuit, missing-skill 404, non-tracking rejection, 403 when imports disabled, 502 on probe failure, plus 4 PUT-toggle cases for managed/no-source/imports-disabled rejection and the regression pin that omitting tracks_upstream preserves the current value) plus 3 cases on SkillsSyncAllTrackingHandler.
tests/ts/skill-wire-format.test.ts (new): 4 cases pinning the skillFromWire snake_case ↔ camelCase contract.

Verification: pytest tests/ --ignore=tests/test_claude_client.py → 905 passed. jlpm tsc --noEmit clean. jlpm jest → 185 passed. jlpm lint:check clean.

Risks and follow-ups

Concurrent same-skill syncs (e.g. two browser tabs clicking the per-row Sync button simultaneously) race the rmtree/copytree. Sync-all is batched within a single handler so cross-skill is safe; only same-skill cross-handler is exposed. A per-skill lock on SkillManager is the principled fix; deferred to keep this PR focused.
SkillEditor state sprawl: pre-existing trend (now 22+ hooks) amplified by this PR's toggle. useReducer would be the right refactor; deferred.
No incremental progress feedback during sync-all: the UI shows "Syncing..." for the whole batch with no per-skill ticker. A future iteration could surface per-skill progress via streaming, but for the typical 1-5 tracking skills the current UX is fine.

Six-agent review pass (code-reuse, code-quality, efficiency, backend/contract, frontend/a11y, test architecture) ran on the four-commit base; the consolidated fix-before-merge items all landed in the final commit (5/5). Remaining items above are documented follow-ups with no immediate impact.

Two new optional frontmatter keys on the Skill dataclass: `tracks_upstream` (bool) marks a user-imported GitHub skill as sync-eligible, and `tracking_ref` stores the SHA of the last successful sync so the upcoming sync action can skip the tarball fetch when upstream hasn't moved. Distinct from the existing `managed_source` / `managed_ref` pair: managed skills are org-curated (read-only in the UI, can be auto-removed by the reconciler), while tracking skills are user-authored (editable, never auto-removed). The two pairs are mutually exclusive in practice but the schema treats them as independent flags so a future refactor that unifies them has a clear seam. Both fields default to falsy and `serialize_skill_md` omits them from frontmatter when not set, so an existing user-imported skill that never opted in produces identical on-disk content as before.

import_from_github() now accepts `tracks_upstream` so the import dialog can stamp the new frontmatter flag at install time. update_skill() accepts a `tracks_upstream` toggle so the detail editor can flip it on or off; toggling off removes the field entirely (a non-tracking skill is identical to a plain user-imported one). Tracking can never be enabled on a managed skill (mutually exclusive metadata) or on a skill with no recorded source URL. sync_tracking_skill(scope, name) is the new sync action: - probes the commits API for the latest SHA at the recorded source, - short-circuits when the SHA matches the recorded tracking_ref (no tarball fetch, no rmtree, no copytree), - otherwise stages the new bundle, then atomically replaces the existing one and rewrites SKILL.md with the new tracking_ref. Probe failure raises rather than re-downloading: the user clicked the button expecting a fresh check, and silently re-fetching on network errors would mask real outages. Tarball-fetch failure preserves the existing bundle on disk (rmtree only happens after staging succeeds). list_tracking_skills() mirrors list_managed_skills() so the upcoming UI toolbar action ("Sync all tracking") has a clean enum point.

Three handler-level changes wire the new manager methods to the frontend: - `POST /skills/{scope}/{name}/sync` invokes sync_tracking_skill via run_in_executor (the call does blocking HTTP). Returns {updated: bool, ref: str}. Gated by allow_github_skill_import: sync is the same network egress with the same trust boundary as the initial import, so admins who've disabled imports do not want sync silently keeping skills fresh. - `POST /skills/sync-all-tracking` iterates every tracking skill and returns a per-skill result array. Failures are isolated per skill; a single broken upstream doesn't stop the rest of the batch. - `PUT /skills/{scope}/{name}` now accepts an optional `tracks_upstream` field so the detail editor can flip the toggle. None preserves the current value; the manager rejects invalid combinations (managed + tracking, no source + tracking). `POST /skills/import` accepts a `tracks_upstream` flag so the import dialog can stamp the new frontmatter at install time. Route order: the new `{scope}/{name}/sync` registers before the generic `{scope}/{name}` catch-all so Tornado doesn't shadow the literal "sync" segment with the SKILL_NAME_REGEX path.

API surface (src/api.ts): - ISkillSummary / ISkillDetail gain `tracksUpstream` + `trackingRef`. - `importSkill({tracksUpstream})` lets the import dialog opt in. - `updateSkill({tracksUpstream})` lets the detail editor toggle it. - New `syncTrackingSkill(scope, name)` and `syncAllTrackingSkills()` bind to the per-skill and batch HTTP endpoints. Skills panel (src/components/skills-panel.tsx): - "Track upstream" checkbox in the GitHub import dialog. Off by default so existing flows are unchanged. - "Tracking" badge on rows whose skill carries `tracks_upstream`. Distinct color from the "Managed" badge (org-curated vs user-opted-in). - Per-row sync button (↻) on tracking skills, disabled when the admin policy `allow_github_skill_import` is off. - Panel toolbar "Sync tracking skills" button appears alongside "Sync managed skills" when any tracking skills are installed. - Editor toggle for tracks_upstream on existing skills that have a source URL and are not managed. Saves immediately on click via the PUT endpoint; the backend rejects invalid combinations (managed + tracking, no source + tracking). Single-source toggle off, never delete on failure, never delete if GitHub is unreachable, never delete if the upstream repo is gone. The bundle on disk is always preserved across any failure path.

Adds a new "Tracking upstream for user-imported skills" section to docs/skills.md. Covers the opt-in mechanic (import checkbox or post-import toggle), what Sync does and what it explicitly will not do (never delete on failure or upstream removal), mutual exclusion with managed status, and the admin-policy gating shared with imports.

Convergent and high-confidence findings from the parallel review pass. Backend (extension.py): - Map RuntimeError to 502 in SkillsBaseHandler.exception_status_map. The probe-failure path raises RuntimeError ("Could not probe GitHub..."); the default 400 fallback misled clients into treating upstream outages as malformed requests. - PUT /skills/{scope}/{name} that includes `tracks_upstream` now rejects with 403 when allow_github_skill_import is false. Without this, an admin's kill switch could silently leak: the toggle bit gets set while sync is the actual network egress that's blocked. - SkillsSyncAllTrackingHandler now runs per-skill syncs through an asyncio.Semaphore(3). Serial worst case for N tracking skills was N * 15s (commits-API timeout); parallel keeps the rate-limit burst civil at 60/hour unauth while collapsing 10-skill sync from ~150s to ~50s. Shared helper (skill_github_import.py): - Extracted `_SHA_RE` and a `resolve_desired_sha(ref_info, token)` helper. Both the reconciler's `_resolve_desired_sha` and the new `sync_tracking_skill` were inline-duplicating the "if 40-char SHA short-circuit else probe" dance. Frontend (skills-panel.tsx, base.css): - The `:disabled` state for `.nbi-icon-button` was visually identical to enabled; added `opacity: 0.4; cursor: not-allowed` so the sync button (and every other icon button when disabled) reads correctly. - The editor's tracking toggle now flips optimistically and rolls back the local state on PUT failure; the previous code left the UI showing the failed-target state. - Removed `role="group"` from the tracking row (no aria-label → screen reader announces "group" with no context). - Stripped an em dash from the import-dialog "Track upstream" label per repo convention. - Tracking badge `background-color` was using `--jp-accept-color3` which doesn't exist in JupyterLab's theme; fallback resolved to `--jp-layout-color2` (same as panel background, badge invisible). Switched to `--jp-info-color3`. Tests: - 4 new handler-layer PUT-tracking cases: managed-skill 400, no-source 400, blocked-when-github-import-disabled 403, and a regression pin that omitting `tracks_upstream` from the PUT body preserves the current value. - 1 new sync-handler case asserting probe failure maps to 502. - 1 new TS file (`tests/ts/skill-wire-format.test.ts`) pinning the snake_case ↔ camelCase boundary on `skillFromWire`. The most load-bearing wire contract in the panel; a typo here would silently corrupt "I toggled it on but it didn't stick." - `_patch_sha` test helpers now patch `skill_github_import.get_latest_commit_sha` at the definition site so the patch covers both the legacy direct callers and the new shared `resolve_desired_sha` wrapper. Deferred follow-ups noted in the review consensus but not landed: per-skill lock for concurrent same-skill syncs (low-probability race), `useReducer` refactor for SkillEditor (pre-existing state sprawl, not a regression), per-row React.memo (negligible at typical skill counts).

Promotes the [Unreleased] CHANGELOG snapshot to [5.0.0] - 2026-05-22 and expands it to cover everything merged into upstream/main after PR plmbr#287's docs refresh. Bumps package.json to 5.0.0. CHANGELOG additions cover the post-plmbr#287 surface: - Settings tabs: plugin marketplace picker (plmbr#284), plugin marketplace details + Update button (plmbr#303), per-workspace MCP disable (plmbr#286), JSON-paste path in Add MCP server (plmbr#285). - Launchers: hide-with-policy (plmbr#288), brand icons for Codex / opencode (plmbr#325, plmbr#333), per-launch directory picker (plmbr#332). - Chat sidebar and agentic UX: workspace @-mention in Claude mode (plmbr#327), reload-open-files-on-disk (plmbr#330), steered system prompt away from over-eager notebook creation (plmbr#336). - Skills: multi-manifest support (plmbr#321), tracks-upstream for user- imported skills (plmbr#322), HTTP kill switch for the reconciler (plmbr#291). - Accessibility: full sub-section covering plmbr#305-plmbr#320. - Security: shell-tool sandbox (plmbr#290), Claude UI-bridge sandbox (plmbr#323), 0o600 on encrypted token (plmbr#293), env-secret scrubbing (plmbr#295), MCP config shape validation (plmbr#299), XSS allowlist (plmbr#296), Copilot WS auth + origin (plmbr#301), GHE host detection (plmbr#292), fastmcp -> mcp SDK swap (plmbr#324). - Fixed: session listing unification (plmbr#310), session preview unwrap (plmbr#331), down-area runtime throw (plmbr#330 follow-up), WS message-handler leak (plmbr#294). - Removed: fastmcp dependency, history.jsonl session gate. Adds a Migration note covering the five behavior changes operators should review before upgrading from 4.x: fastmcp swap, path sandboxes, history.jsonl gate removal, workspace @-mention pointer shape, and the Copilot WebSocket auth/origin tightening. Two reviewer rounds (six personas each) applied: - Round 1 caught security overclaims (plmbr#293, plmbr#299, plmbr#323), the plmbr#284/plmbr#303 mis-attribution, missing migration note, 3 em dashes, and the stale `fastmcp==2.x.*` recommendation in the admin guide. - Round 2 caught the missing plmbr#301 migration bullet, missing version- matrix 5.0.x row, missing README TOC entry, and a couple of style nits (sub-heading overpromise, orphan bullet). Skipped (deferred to future PRs): - README first-run tour mention. - Admin guide HTTP kill-switch row in Failure-modes table. - Terminal drag-drop trust-model precision update after plmbr#327. - Cipher description nit in plmbr#293 (Fernet AES-128-CBC+HMAC, not AES-GCM).

pjdoland added 6 commits May 18, 2026 21:54

pjdoland added the enhancement New feature or request label May 19, 2026

mbektas approved these changes May 19, 2026

View reviewed changes

mbektas merged commit 5a4037a into plmbr:main May 19, 2026
4 of 5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(skills): track upstream for user-imported GitHub skills#322

feat(skills): track upstream for user-imported GitHub skills#322
mbektas merged 6 commits into
plmbr:mainfrom
pjdoland:feat/skills-track-upstream

pjdoland commented May 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pjdoland commented May 19, 2026

Summary

Solution highlights

Mutual exclusion

Never-delete invariants

Testing

Risks and follow-ups

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants