Skip to content

feat(skills): track upstream for user-imported GitHub skills#322

Merged
mbektas merged 6 commits into
plmbr:mainfrom
pjdoland:feat/skills-track-upstream
May 19, 2026
Merged

feat(skills): track upstream for user-imported GitHub skills#322
mbektas merged 6 commits into
plmbr:mainfrom
pjdoland:feat/skills-track-upstream

Conversation

@pjdoland
Copy link
Copy Markdown
Collaborator

Summary

User-imported GitHub skills can now opt into tracking upstream. The Import-from-GitHub dialog has a new Track upstream checkbox; ticking it stamps the installed skill with a tracks_upstream frontmatter flag and lights up a per-skill Sync button (and a panel-level Sync tracking skills batch action).

Sync re-fetches the bundle from the recorded source URL and replaces the on-disk content. Distinct from the existing managed-skills reconciler: tracking is opt-in per skill, never auto-scheduled, never auto-removed. The bundle on disk is preserved across every failure path (probe failure, tarball fetch failure, upstream repo deletion); the user clicked Sync, they're shown the error, and the existing bundle stays put.

Solution highlights

  • Schema (skillset.py): two new optional frontmatter keys, tracks_upstream and tracking_ref. Default falsy and omitted from serialized SKILL.md when unset, so existing user-imported skills are unchanged on disk.
  • SkillManager.sync_tracking_skill(scope, name): probes the commits API for the latest SHA at the recorded source, short-circuits when the SHA matches tracking_ref (no tarball fetch, no rmtree). When the SHA differs, stages the new bundle, atomically replaces, stamps the new ref. Probe failure raises rather than silently re-downloading. Tarball-fetch failure preserves the existing bundle on disk (rmtree only happens after staging succeeds).
  • HTTP: POST /skills/{scope}/{name}/sync and POST /skills/sync-all-tracking route through run_in_executor. The batch endpoint runs per-skill syncs through an asyncio.Semaphore(3) so N skills don't take N * 15s in the worst case. Both gated by allow_github_skill_import (same network egress, same trust boundary). PUT /skills/{scope}/{name} accepts a tracks_upstream field for toggling on existing skills; that toggle is also gated by allow_github_skill_import so an admin's kill switch can't leak.
  • Shared SHA helper (skill_github_import.py): new resolve_desired_sha(ref_info, token) consolidates the "if 40-char SHA short-circuit else probe" dance that the managed reconciler and the new sync path both run.
  • Frontend: a "Tracking" badge (distinct from "Managed" by color and tooltip), a per-row sync button with disabled-state styling, a panel toolbar batch button, and an editor toggle for opting in/out post-import. The toggle flips optimistically and rolls back on PUT failure so the checkbox always reflects server state.

Mutual exclusion

A skill cannot be both managed and tracking. The manager refuses to enable tracking on a managed skill (ValueError → 400 at the HTTP layer). The two metadata pairs (managed_source/managed_ref vs tracks_upstream/tracking_ref) are independent on the wire but enforced exclusive at the manager.

Never-delete invariants

The "never delete" contract is enforced by the call ordering:

  • rmtree(skill.root_path) runs only after stage_skill_from_github succeeds (which is the only step that can throw on a network or 404 error).
  • A probe failure raises before rmtree is even reached.
  • The reconciler is not extended to touch tracking skills; only the manual sync action can mutate them.

Testing

  • tests/test_skill_manager.py: 12 new cases in TestTrackUpstream (import opt-in, toggle on/off via update_skill, sync happy path, sync skips on matching SHA, sync raises on probe failure, sync rejects non-tracking/managed/missing skills, sync preserves bundle on staging failure, list_tracking_skills filter) plus 3 cases in TestManagedFrontmatter for the schema fields.
  • tests/test_skills_handlers.py: 9 new cases (SkillSyncHandler happy path, unchanged short-circuit, missing-skill 404, non-tracking rejection, 403 when imports disabled, 502 on probe failure, plus 4 PUT-toggle cases for managed/no-source/imports-disabled rejection and the regression pin that omitting tracks_upstream preserves the current value) plus 3 cases on SkillsSyncAllTrackingHandler.
  • tests/ts/skill-wire-format.test.ts (new): 4 cases pinning the skillFromWire snake_case ↔ camelCase contract.

Verification: pytest tests/ --ignore=tests/test_claude_client.py → 905 passed. jlpm tsc --noEmit clean. jlpm jest → 185 passed. jlpm lint:check clean.

Risks and follow-ups

  • Concurrent same-skill syncs (e.g. two browser tabs clicking the per-row Sync button simultaneously) race the rmtree/copytree. Sync-all is batched within a single handler so cross-skill is safe; only same-skill cross-handler is exposed. A per-skill lock on SkillManager is the principled fix; deferred to keep this PR focused.
  • SkillEditor state sprawl: pre-existing trend (now 22+ hooks) amplified by this PR's toggle. useReducer would be the right refactor; deferred.
  • No incremental progress feedback during sync-all: the UI shows "Syncing..." for the whole batch with no per-skill ticker. A future iteration could surface per-skill progress via streaming, but for the typical 1-5 tracking skills the current UX is fine.

Six-agent review pass (code-reuse, code-quality, efficiency, backend/contract, frontend/a11y, test architecture) ran on the four-commit base; the consolidated fix-before-merge items all landed in the final commit (5/5). Remaining items above are documented follow-ups with no immediate impact.

pjdoland added 6 commits May 18, 2026 21:54
Two new optional frontmatter keys on the Skill dataclass:
`tracks_upstream` (bool) marks a user-imported GitHub skill as
sync-eligible, and `tracking_ref` stores the SHA of the last
successful sync so the upcoming sync action can skip the tarball
fetch when upstream hasn't moved.

Distinct from the existing `managed_source` / `managed_ref` pair:
managed skills are org-curated (read-only in the UI, can be
auto-removed by the reconciler), while tracking skills are
user-authored (editable, never auto-removed). The two pairs are
mutually exclusive in practice but the schema treats them as
independent flags so a future refactor that unifies them has a
clear seam.

Both fields default to falsy and `serialize_skill_md` omits them
from frontmatter when not set, so an existing user-imported skill
that never opted in produces identical on-disk content as before.
import_from_github() now accepts `tracks_upstream` so the import
dialog can stamp the new frontmatter flag at install time.
update_skill() accepts a `tracks_upstream` toggle so the detail
editor can flip it on or off; toggling off removes the field
entirely (a non-tracking skill is identical to a plain user-imported
one). Tracking can never be enabled on a managed skill (mutually
exclusive metadata) or on a skill with no recorded source URL.

sync_tracking_skill(scope, name) is the new sync action:
- probes the commits API for the latest SHA at the recorded source,
- short-circuits when the SHA matches the recorded tracking_ref
  (no tarball fetch, no rmtree, no copytree),
- otherwise stages the new bundle, then atomically replaces the
  existing one and rewrites SKILL.md with the new tracking_ref.

Probe failure raises rather than re-downloading: the user clicked
the button expecting a fresh check, and silently re-fetching on
network errors would mask real outages. Tarball-fetch failure
preserves the existing bundle on disk (rmtree only happens after
staging succeeds).

list_tracking_skills() mirrors list_managed_skills() so the
upcoming UI toolbar action ("Sync all tracking") has a clean enum
point.
Three handler-level changes wire the new manager methods to the
frontend:

- `POST /skills/{scope}/{name}/sync` invokes sync_tracking_skill via
  run_in_executor (the call does blocking HTTP). Returns
  {updated: bool, ref: str}. Gated by allow_github_skill_import:
  sync is the same network egress with the same trust boundary as
  the initial import, so admins who've disabled imports do not want
  sync silently keeping skills fresh.
- `POST /skills/sync-all-tracking` iterates every tracking skill and
  returns a per-skill result array. Failures are isolated per skill;
  a single broken upstream doesn't stop the rest of the batch.
- `PUT /skills/{scope}/{name}` now accepts an optional
  `tracks_upstream` field so the detail editor can flip the toggle.
  None preserves the current value; the manager rejects invalid
  combinations (managed + tracking, no source + tracking).

`POST /skills/import` accepts a `tracks_upstream` flag so the import
dialog can stamp the new frontmatter at install time.

Route order: the new `{scope}/{name}/sync` registers before the
generic `{scope}/{name}` catch-all so Tornado doesn't shadow the
literal "sync" segment with the SKILL_NAME_REGEX path.
API surface (src/api.ts):
- ISkillSummary / ISkillDetail gain `tracksUpstream` + `trackingRef`.
- `importSkill({tracksUpstream})` lets the import dialog opt in.
- `updateSkill({tracksUpstream})` lets the detail editor toggle it.
- New `syncTrackingSkill(scope, name)` and `syncAllTrackingSkills()`
  bind to the per-skill and batch HTTP endpoints.

Skills panel (src/components/skills-panel.tsx):
- "Track upstream" checkbox in the GitHub import dialog. Off by
  default so existing flows are unchanged.
- "Tracking" badge on rows whose skill carries `tracks_upstream`.
  Distinct color from the "Managed" badge (org-curated vs
  user-opted-in).
- Per-row sync button (↻) on tracking skills, disabled when the
  admin policy `allow_github_skill_import` is off.
- Panel toolbar "Sync tracking skills" button appears alongside
  "Sync managed skills" when any tracking skills are installed.
- Editor toggle for tracks_upstream on existing skills that have a
  source URL and are not managed. Saves immediately on click via
  the PUT endpoint; the backend rejects invalid combinations
  (managed + tracking, no source + tracking).

Single-source toggle off, never delete on failure, never delete if
GitHub is unreachable, never delete if the upstream repo is gone.
The bundle on disk is always preserved across any failure path.
Adds a new "Tracking upstream for user-imported skills" section to
docs/skills.md. Covers the opt-in mechanic (import checkbox or
post-import toggle), what Sync does and what it explicitly will
not do (never delete on failure or upstream removal), mutual
exclusion with managed status, and the admin-policy gating shared
with imports.
Convergent and high-confidence findings from the parallel review pass.

Backend (extension.py):
- Map RuntimeError to 502 in SkillsBaseHandler.exception_status_map.
  The probe-failure path raises RuntimeError ("Could not probe
  GitHub..."); the default 400 fallback misled clients into treating
  upstream outages as malformed requests.
- PUT /skills/{scope}/{name} that includes `tracks_upstream` now
  rejects with 403 when allow_github_skill_import is false. Without
  this, an admin's kill switch could silently leak: the toggle bit
  gets set while sync is the actual network egress that's blocked.
- SkillsSyncAllTrackingHandler now runs per-skill syncs through an
  asyncio.Semaphore(3). Serial worst case for N tracking skills was
  N * 15s (commits-API timeout); parallel keeps the rate-limit
  burst civil at 60/hour unauth while collapsing 10-skill sync from
  ~150s to ~50s.

Shared helper (skill_github_import.py):
- Extracted `_SHA_RE` and a `resolve_desired_sha(ref_info, token)`
  helper. Both the reconciler's `_resolve_desired_sha` and the new
  `sync_tracking_skill` were inline-duplicating the "if 40-char SHA
  short-circuit else probe" dance.

Frontend (skills-panel.tsx, base.css):
- The `:disabled` state for `.nbi-icon-button` was visually
  identical to enabled; added `opacity: 0.4; cursor: not-allowed`
  so the sync button (and every other icon button when disabled)
  reads correctly.
- The editor's tracking toggle now flips optimistically and rolls
  back the local state on PUT failure; the previous code left the
  UI showing the failed-target state.
- Removed `role="group"` from the tracking row (no aria-label →
  screen reader announces "group" with no context).
- Stripped an em dash from the import-dialog "Track upstream"
  label per repo convention.
- Tracking badge `background-color` was using `--jp-accept-color3`
  which doesn't exist in JupyterLab's theme; fallback resolved to
  `--jp-layout-color2` (same as panel background, badge invisible).
  Switched to `--jp-info-color3`.

Tests:
- 4 new handler-layer PUT-tracking cases: managed-skill 400,
  no-source 400, blocked-when-github-import-disabled 403, and a
  regression pin that omitting `tracks_upstream` from the PUT body
  preserves the current value.
- 1 new sync-handler case asserting probe failure maps to 502.
- 1 new TS file (`tests/ts/skill-wire-format.test.ts`) pinning the
  snake_case ↔ camelCase boundary on `skillFromWire`. The most
  load-bearing wire contract in the panel; a typo here would
  silently corrupt "I toggled it on but it didn't stick."
- `_patch_sha` test helpers now patch
  `skill_github_import.get_latest_commit_sha` at the definition
  site so the patch covers both the legacy direct callers and the
  new shared `resolve_desired_sha` wrapper.

Deferred follow-ups noted in the review consensus but not landed:
per-skill lock for concurrent same-skill syncs (low-probability
race), `useReducer` refactor for SkillEditor (pre-existing state
sprawl, not a regression), per-row React.memo (negligible at
typical skill counts).
@pjdoland pjdoland added the enhancement New feature or request label May 19, 2026
@mbektas mbektas merged commit 5a4037a into plmbr:main May 19, 2026
4 of 5 checks passed
pjdoland added a commit to pjdoland/notebook-intelligence that referenced this pull request May 22, 2026
Promotes the [Unreleased] CHANGELOG snapshot to [5.0.0] - 2026-05-22
and expands it to cover everything merged into upstream/main after
PR plmbr#287's docs refresh. Bumps package.json to 5.0.0.

CHANGELOG additions cover the post-plmbr#287 surface:

- Settings tabs: plugin marketplace picker (plmbr#284), plugin marketplace
  details + Update button (plmbr#303), per-workspace MCP disable (plmbr#286),
  JSON-paste path in Add MCP server (plmbr#285).
- Launchers: hide-with-policy (plmbr#288), brand icons for Codex / opencode
  (plmbr#325, plmbr#333), per-launch directory picker (plmbr#332).
- Chat sidebar and agentic UX: workspace @-mention in Claude mode
  (plmbr#327), reload-open-files-on-disk (plmbr#330), steered system prompt
  away from over-eager notebook creation (plmbr#336).
- Skills: multi-manifest support (plmbr#321), tracks-upstream for user-
  imported skills (plmbr#322), HTTP kill switch for the reconciler (plmbr#291).
- Accessibility: full sub-section covering plmbr#305-plmbr#320.
- Security: shell-tool sandbox (plmbr#290), Claude UI-bridge sandbox (plmbr#323),
  0o600 on encrypted token (plmbr#293), env-secret scrubbing (plmbr#295), MCP
  config shape validation (plmbr#299), XSS allowlist (plmbr#296), Copilot WS
  auth + origin (plmbr#301), GHE host detection (plmbr#292), fastmcp -> mcp SDK
  swap (plmbr#324).
- Fixed: session listing unification (plmbr#310), session preview unwrap
  (plmbr#331), down-area runtime throw (plmbr#330 follow-up), WS message-handler
  leak (plmbr#294).
- Removed: fastmcp dependency, history.jsonl session gate.

Adds a Migration note covering the five behavior changes operators
should review before upgrading from 4.x: fastmcp swap, path
sandboxes, history.jsonl gate removal, workspace @-mention pointer
shape, and the Copilot WebSocket auth/origin tightening.

Two reviewer rounds (six personas each) applied:
- Round 1 caught security overclaims (plmbr#293, plmbr#299, plmbr#323), the
  plmbr#284/plmbr#303 mis-attribution, missing migration note, 3 em dashes,
  and the stale `fastmcp==2.x.*` recommendation in the admin guide.
- Round 2 caught the missing plmbr#301 migration bullet, missing version-
  matrix 5.0.x row, missing README TOC entry, and a couple of style
  nits (sub-heading overpromise, orphan bullet).

Skipped (deferred to future PRs):
- README first-run tour mention.
- Admin guide HTTP kill-switch row in Failure-modes table.
- Terminal drag-drop trust-model precision update after plmbr#327.
- Cipher description nit in plmbr#293 (Fernet AES-128-CBC+HMAC, not
  AES-GCM).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants