Skip to content

security: detect GitHub Enterprise hosts for marketplace add#292

Merged
mbektas merged 3 commits into
plmbr:mainfrom
pjdoland:fix/recognize-ghe-marketplace
May 18, 2026
Merged

security: detect GitHub Enterprise hosts for marketplace add#292
mbektas merged 3 commits into
plmbr:mainfrom
pjdoland:fix/recognize-ghe-marketplace

Conversation

@pjdoland
Copy link
Copy Markdown
Collaborator

@pjdoland pjdoland commented May 18, 2026

Summary

The plugin-marketplace source detector recognized public github.com only. A GHE marketplace URL like https://github.acme.com/owner/repo silently:

  1. Bypassed allow_github_plugin_import = False. An admin who flipped the policy gate expected GHE adds to be rejected; the detector returned False so they were allowed.
  2. Fell through the GitHub-token injection chain. GITHUB_TOKEN / gh auth token resolution only fires when the detector says "this is GitHub." GHE adds went to anonymous git auth and failed with opaque could not read username errors against private repos.

The PR adds a new admin-declarable list of GHE hostnames so both surfaces work for self-hosted GitHub Enterprise.

Solution

New env NBI_GITHUB_ENTERPRISE_HOSTS (CSV) declares the additional hostnames the detector treats as GitHub. Cookie-domain shape:

  • bare token (github.acme.com) matches the exact host only
  • leading-dot token (.acme.com) matches every subdomain of acme.com

Subdomain matching is opt-in by design. If suffix-matching were the default, declaring acme.com would silently ship GITHUB_TOKEN into the env of any marketplace add against jira.acme.com, artifactory.acme.com, or any other corp service on the same domain. Admins who actually want the broad behavior must spell it out with the leading dot.

The change also refactors is_github_marketplace_source to route both URL and SCP-shorthand branches through a single _is_github_host helper, and fixes a pre-existing oddity: git@anyhost:o/r was incidentally classified as bare owner/repo shorthand because the regex saw "one slash, non-local prefix" and let it through. That would have landed the same token-injection bypass for any unknown SCP host. The shorthand branch now rejects strings containing @ or :.

Second-pass review added two host-detection edges:

  • _normalize_host strips a trailing dot before matching. urllib.parse.urlparse preserves the trailing dot on FQDNs (github.com.), so the exact-equality matcher would otherwise silently bypass both the public-github.com recognizer and the GHE-token matcher.
  • The URL branch refuses URLs whose userinfo carries a non-git username or any password before treating the extracted host as authoritative. https://attacker@github.com/owner/repo resolves to hostname github.com via urlparse but is a lookalike-attack shape. ssh://git@github.com/... (the legitimate path) is preserved.

Testing

Pytest cases cover:

  • URL forms: HTTPS, git://, ssh://, case-insensitive
  • SCP forms: git@github.acme.com:owner/repo, case-insensitive, lookalike rejection
  • Exact-match-rejects-subdomain-by-default contract
  • Leading-dot opt-in to subdomain matching, including "bare apex NOT included"
  • CSV with multiple tokens, mixing exact + subdomain shapes
  • Lookalike rejection on both URL and SCP shapes (e.g., github.acme.com.evil.test)
  • Empty env preserves backward-compatible default
  • Trailing-dot FQDN matches for both public-github and GHE shapes
  • Embedded-userinfo URL rejection (with ssh://git@... regression guard)
  • End-to-end: add_marketplace with GHE URL + allow_github=False raises PermissionError
  • End-to-end: add_marketplace with GHE URL injects GITHUB_TOKEN into the CLI env

Full suite green: pytest 792 pass, tsc clean, prettier clean, jest clean.

Risks / follow-ups

  • Pre-existing IDN homograph (out of scope here): the detector does no IDNA normalization. A user pasting a visually-spoofed Unicode URL would be rejected by the byte-literal matcher anyway, so the risk is misleading-error rather than silent-bypass. Worth a future doc note if needed.
  • The new env is independent of NBI_GHE_SUBDOMAIN (which only configures Copilot OAuth). Doc explicitly calls this out so admins don't assume one covers the other.

`is_github_marketplace_source` recognized public github.com only. A GHE
marketplace URL silently bypassed `allow_github_plugin_import = False`
AND fell through the GITHUB_TOKEN / gh-auth-token injection chain to
anonymous git auth. Both surfaces are security-relevant: a regulated
tenant that flipped the policy gate to force-off would still see the
GHE add succeed, and a private GHE repo would refuse the anonymous
fetch with an opaque error.

New env `NBI_GITHUB_ENTERPRISE_HOSTS` (CSV) declares the hostnames the
detector should treat as GitHub. Cookie-domain shape:

  - bare token      (`github.acme.com`)  matches the exact host only
  - leading-dot     (`.acme.com`)        matches every subdomain

Subdomain matching is opt-in deliberately. Broad declaration like a
bare `acme.com` apex would silently inject GITHUB_TOKEN into the env of
any marketplace add against `jira.acme.com`, `artifactory.acme.com`,
etc.; admins who actually want that behavior must spell it out with
the leading dot.

The change also refactors `is_github_marketplace_source` to route both
URL and SCP branches through a single `_is_github_host` helper, and
fixes a pre-existing oddity where strings like `git@anyhost:o/r` were
incidentally classified as bare `owner/repo` shorthand (one `/`,
non-local prefix) and would land the same token-injection bypass for
unknown SCP hosts. The shorthand branch now rejects strings containing
`@` or `:`.

Tests cover URL and SCP branches, case-insensitive matching, multi-host
CSV, mixed exact/subdomain tokens, lookalike rejection on both URL and
SCP shapes, exact-match-rejects-subdomain-by-default contract, and
end-to-end policy gate + token injection assertions on `add_marketplace`
with GHE URLs.
@pjdoland pjdoland added the bug Something isn't working label May 18, 2026
pjdoland added 2 commits May 18, 2026 06:52
…shapes

Second-pass review surfaced two GHE-detection edges that the first
round missed:

1. urllib.parse.urlparse preserves the trailing dot on FQDNs
   (https://github.com.), and the matchers used exact ==, so a
   trailing-dot FQDN silently bypassed both the public-github.com
   recognizer and the GHE-token matcher. Normalize the host through
   _normalize_host (lower + rstrip('.')) at the _is_github_host entry
   so every downstream comparison sees the canonical form.

2. Defense-in-depth on the URL branch: a source like
   https://attacker@github.com/owner/repo resolves to hostname
   github.com via urlparse, which the matcher then treats as
   authoritative. git/claude downstream would normalize the URL and
   refuse, but the policy gate should not depend on that. Reject any
   URL whose userinfo carries a non-`git` username or any password
   before treating the extracted host as authoritative.

New parametrized tests pin both the trailing-dot acceptance and the
embedded-userinfo rejection (including the ssh://git@... regression
guard for the legitimate path). One docs reword fixes a sentence that
read as if a current rule referenced a hypothetical configuration.
…arketplace

# Conflicts:
#	docs/admin-guide.md
@mbektas mbektas merged commit 51ceda2 into plmbr:main May 18, 2026
5 checks passed
pjdoland added a commit to pjdoland/notebook-intelligence that referenced this pull request May 22, 2026
Promotes the [Unreleased] CHANGELOG snapshot to [5.0.0] - 2026-05-22
and expands it to cover everything merged into upstream/main after
PR plmbr#287's docs refresh. Bumps package.json to 5.0.0.

CHANGELOG additions cover the post-plmbr#287 surface:

- Settings tabs: plugin marketplace picker (plmbr#284), plugin marketplace
  details + Update button (plmbr#303), per-workspace MCP disable (plmbr#286),
  JSON-paste path in Add MCP server (plmbr#285).
- Launchers: hide-with-policy (plmbr#288), brand icons for Codex / opencode
  (plmbr#325, plmbr#333), per-launch directory picker (plmbr#332).
- Chat sidebar and agentic UX: workspace @-mention in Claude mode
  (plmbr#327), reload-open-files-on-disk (plmbr#330), steered system prompt
  away from over-eager notebook creation (plmbr#336).
- Skills: multi-manifest support (plmbr#321), tracks-upstream for user-
  imported skills (plmbr#322), HTTP kill switch for the reconciler (plmbr#291).
- Accessibility: full sub-section covering plmbr#305-plmbr#320.
- Security: shell-tool sandbox (plmbr#290), Claude UI-bridge sandbox (plmbr#323),
  0o600 on encrypted token (plmbr#293), env-secret scrubbing (plmbr#295), MCP
  config shape validation (plmbr#299), XSS allowlist (plmbr#296), Copilot WS
  auth + origin (plmbr#301), GHE host detection (plmbr#292), fastmcp -> mcp SDK
  swap (plmbr#324).
- Fixed: session listing unification (plmbr#310), session preview unwrap
  (plmbr#331), down-area runtime throw (plmbr#330 follow-up), WS message-handler
  leak (plmbr#294).
- Removed: fastmcp dependency, history.jsonl session gate.

Adds a Migration note covering the five behavior changes operators
should review before upgrading from 4.x: fastmcp swap, path
sandboxes, history.jsonl gate removal, workspace @-mention pointer
shape, and the Copilot WebSocket auth/origin tightening.

Two reviewer rounds (six personas each) applied:
- Round 1 caught security overclaims (plmbr#293, plmbr#299, plmbr#323), the
  plmbr#284/plmbr#303 mis-attribution, missing migration note, 3 em dashes,
  and the stale `fastmcp==2.x.*` recommendation in the admin guide.
- Round 2 caught the missing plmbr#301 migration bullet, missing version-
  matrix 5.0.x row, missing README TOC entry, and a couple of style
  nits (sub-heading overpromise, orphan bullet).

Skipped (deferred to future PRs):
- README first-run tour mention.
- Admin guide HTTP kill-switch row in Failure-modes table.
- Terminal drag-drop trust-model precision update after plmbr#327.
- Cipher description nit in plmbr#293 (Fernet AES-128-CBC+HMAC, not
  AES-GCM).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants