Skip to content

security(ws): authenticate and origin-check Copilot WebSocket upgrades#301

Merged
mbektas merged 2 commits into
plmbr:mainfrom
pjdoland:fix/ws-handler-auth-xsrf
May 19, 2026
Merged

security(ws): authenticate and origin-check Copilot WebSocket upgrades#301
mbektas merged 2 commits into
plmbr:mainfrom
pjdoland:fix/ws-handler-auth-xsrf

Conversation

@pjdoland
Copy link
Copy Markdown
Collaborator

@pjdoland pjdoland commented May 18, 2026

Summary

WebsocketCopilotHandler subclassed raw tornado.websocket.WebSocketHandler with an empty open() body: no @authenticated, no explicit check_origin, no check_xsrf_cookie, no audit log. The handler drives the chat tool-execution pipeline (run_command_in_embedded_terminal, file edits, notebook edits, MCP tool calls), so it is the highest-privilege endpoint in the extension. Tornado's WebSocketHandler default check_origin enforces same-host only, which leaves the endpoint reachable from any cross-origin tab the browser allows to send WS upgrades, and from any in-network process on a multi-tenant Jupyter Server.

Solution

Adopt Jupyter's first-party WebSocket pattern, matching KernelWebsocketHandler and friends:

class WebsocketCopilotHandler(WebSocketMixin, websocket.WebSocketHandler, JupyterHandler):
    @ws_authenticated
    def open(self):
        log.info("Copilot WS upgrade accepted user=%r origin=%r", ...)
  • WebSocketMixin (jupyter_server.base.websocket) contributes the upgrade-aware check_origin that respects the server's allow_origin / allow_origin_pat traitlets, plus the ping/pong keepalive Jupyter expects on every WS handler. The default WebSocketHandler check_origin (any same-host origin) is shadowed.
  • JupyterHandler contributes identity-provider integration (current_user, check_xsrf_cookie) so cookies and tokens are resolved against the same machinery as the REST routes.
  • ws_authenticated (jupyter_server.auth.decorator) raises HTTPError(403) on unauthenticated upgrades. tornado.web.authenticated would redirect(login_url) which is meaningless on a WS upgrade.

Accepted upgrades log the resolved user identity and origin so an incident can be correlated with the negotiated identity (Jupyter's identity provider returns a User dataclass; the getattr(..., "username", ...) fallback handles custom providers that return a string).

Testing

  • tests/test_ws_handler_auth.py: pins inheritance from both WebSocketMixin and JupyterHandler; MRO ordering so WebSocketMixin.check_origin wins and JupyterHandler.check_xsrf_cookie provides the upgrade-aware XSRF check; behavioral assertion that open() raises HTTPError(403) when current_user is falsy; and a caplog assertion that an accepted upgrade emits the audit line with the resolved user identity. A regression that removes the decorator, reorders the bases, or drops the audit line fails one of these.
  • Test scaffolds in tests/test_image_context.py and tests/test_websocket_handler_integration.py updated: the mocked Application now exposes the settings / transforms / ui_methods / ui_modules keys that JupyterHandler.set_default_headers walks during __init__.
  • Full suites: pytest 779 pass, tsc clean, prettier clean, jest clean.

Risks / follow-ups

  • Behavior change: cross-origin browser tabs that previously could open the WS without an allow_origin entry will see a 403. Same-origin (the labextension's normal flow) is unaffected. Operators running with --ServerApp.allow_origin='*' are unchanged: the WS upgrade now follows the same allow-list logic the REST endpoints already follow.
  • No-auth dev mode: --ServerApp.token='' continues to work because Jupyter's identity provider supplies an anonymous User for which current_user is truthy.
  • JupyterHub trusted-proxy: the inherited WebSocketMixin.check_origin checks skip_check_origin() which the proxy-trust path sets, so existing Hub deployments are unaffected.
  • Audit log on reject: 403s are logged by Jupyter at WARNING already; a richer log_exception override is tracked separately if operators ask for more context (User-Agent, remote_ip, route) on rejected upgrades.
  • Subprotocol negotiation: the handler doesn't override select_subprotocol. The default (echo none, ignore Sec-WebSocket-Protocol headers) is fine for the chat payload but worth a defensive no-op in a future hardening pass.

The admin guide HTTP-API surface section is updated to call out the new posture so an operator reading the docs knows the chat WS endpoint inherits the same gates as the REST handlers.

WebsocketCopilotHandler subclassed raw tornado.websocket.WebSocketHandler
with an empty open() body: no @authenticated, no explicit check_origin,
no check_xsrf_cookie, no audit log. The handler drives the chat
tool-execution pipeline (run_command_in_embedded_terminal, file edits,
notebook edits, MCP tool calls), so it is the highest-privilege endpoint
in the extension. Tornado's WebSocketHandler default check_origin
enforces same-host only, which leaves the endpoint reachable from any
cross-origin tab that the browser allows to send WS upgrades, and from
any in-network process on a multi-tenant Jupyter Server.

Adopt Jupyter's first-party WebSocket pattern: subclass
``WebSocketMixin``, ``websocket.WebSocketHandler``, and ``JupyterHandler``
(in that order). WebSocketMixin contributes the upgrade-aware
``check_origin`` that respects ``allow_origin`` / ``allow_origin_pat``
and the ping/pong keepalive Jupyter expects on its WS handlers.
JupyterHandler contributes identity-provider integration and
``check_xsrf_cookie``. ``open()`` is now decorated with
``jupyter_server.auth.decorator.ws_authenticated``, which returns 403
on unauthenticated upgrades rather than ``tornado.web.authenticated``'s
redirect-to-login behavior that is meaningless on a WS upgrade. Accepted
upgrades log the resolved user identity and origin for incident audit.

Test scaffolds updated: the mocked Application now exposes the
``settings``, ``transforms``, ``ui_methods``, ``ui_modules`` keys
JupyterHandler.set_default_headers walks during ``__init__``. New
tests in tests/test_ws_handler_auth.py pin the inheritance, the MRO
ordering so the hardened check_origin wins, and behaviorally that
``open()`` raises HTTPError(403) when current_user is falsy.

Admin guide updated to clarify that the Copilot WS endpoint now goes
through the same allow_origin and identity-provider gates as the
REST handlers.
@pjdoland pjdoland added the bug Something isn't working label May 18, 2026
@pjdoland pjdoland changed the title fix(ws): authenticate and origin-check Copilot WebSocket upgrades security(ws): authenticate and origin-check Copilot WebSocket upgrades May 18, 2026
Second-pass review surfaced a clean merge conflict where main added
the per-message callback-cleanup helper (PR plmbr#294's _run_request_thread)
adjacent to where this PR decorated open() with @ws_authenticated.
Resolve by keeping _run_request_thread first and the decorator
immediately above open(). The two are orthogonal and compose cleanly.

Add two tests the first round missed:

- A symmetric MRO assertion that check_xsrf_cookie resolves to
  JupyterHandler (the upgrade-aware override that skips XSRF only
  when origin + auth are already satisfied). The matching
  check_origin assertion was already pinned to WebSocketMixin.

- A caplog assertion that the accepted-upgrade audit line fires with
  the resolved user identity and the Origin header. Incident triage
  relies on that log; a future refactor that drops it would otherwise
  ship unnoticed.
@mbektas mbektas merged commit 8cfaa5c into plmbr:main May 19, 2026
5 checks passed
pjdoland added a commit to pjdoland/notebook-intelligence that referenced this pull request May 22, 2026
Promotes the [Unreleased] CHANGELOG snapshot to [5.0.0] - 2026-05-22
and expands it to cover everything merged into upstream/main after
PR plmbr#287's docs refresh. Bumps package.json to 5.0.0.

CHANGELOG additions cover the post-plmbr#287 surface:

- Settings tabs: plugin marketplace picker (plmbr#284), plugin marketplace
  details + Update button (plmbr#303), per-workspace MCP disable (plmbr#286),
  JSON-paste path in Add MCP server (plmbr#285).
- Launchers: hide-with-policy (plmbr#288), brand icons for Codex / opencode
  (plmbr#325, plmbr#333), per-launch directory picker (plmbr#332).
- Chat sidebar and agentic UX: workspace @-mention in Claude mode
  (plmbr#327), reload-open-files-on-disk (plmbr#330), steered system prompt
  away from over-eager notebook creation (plmbr#336).
- Skills: multi-manifest support (plmbr#321), tracks-upstream for user-
  imported skills (plmbr#322), HTTP kill switch for the reconciler (plmbr#291).
- Accessibility: full sub-section covering plmbr#305-plmbr#320.
- Security: shell-tool sandbox (plmbr#290), Claude UI-bridge sandbox (plmbr#323),
  0o600 on encrypted token (plmbr#293), env-secret scrubbing (plmbr#295), MCP
  config shape validation (plmbr#299), XSS allowlist (plmbr#296), Copilot WS
  auth + origin (plmbr#301), GHE host detection (plmbr#292), fastmcp -> mcp SDK
  swap (plmbr#324).
- Fixed: session listing unification (plmbr#310), session preview unwrap
  (plmbr#331), down-area runtime throw (plmbr#330 follow-up), WS message-handler
  leak (plmbr#294).
- Removed: fastmcp dependency, history.jsonl session gate.

Adds a Migration note covering the five behavior changes operators
should review before upgrading from 4.x: fastmcp swap, path
sandboxes, history.jsonl gate removal, workspace @-mention pointer
shape, and the Copilot WebSocket auth/origin tightening.

Two reviewer rounds (six personas each) applied:
- Round 1 caught security overclaims (plmbr#293, plmbr#299, plmbr#323), the
  plmbr#284/plmbr#303 mis-attribution, missing migration note, 3 em dashes,
  and the stale `fastmcp==2.x.*` recommendation in the admin guide.
- Round 2 caught the missing plmbr#301 migration bullet, missing version-
  matrix 5.0.x row, missing README TOC entry, and a couple of style
  nits (sub-heading overpromise, orphan bullet).

Skipped (deferred to future PRs):
- README first-run tour mention.
- Admin guide HTTP kill-switch row in Failure-modes table.
- Terminal drag-drop trust-model precision update after plmbr#327.
- Cipher description nit in plmbr#293 (Fernet AES-128-CBC+HMAC, not
  AES-GCM).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants