Skip to content

feat(terminal): drag-drop file attach with @-mention / raw modes#256

Merged
mbektas merged 8 commits into
plmbr:mainfrom
pjdoland:feat/248-terminal-drag-drop
May 16, 2026
Merged

feat(terminal): drag-drop file attach with @-mention / raw modes#256
mbektas merged 8 commits into
plmbr:mainfrom
pjdoland:feat/248-terminal-drag-drop

Conversation

@pjdoland
Copy link
Copy Markdown
Collaborator

@pjdoland pjdoland commented May 14, 2026

Summary

Drag a file from your desktop onto a JupyterLab terminal, and the terminal injects @<path> (Claude Code's @-mention syntax) at the cursor. The browser's security model doesn't expose OS paths to web pages, so the browser uploads the dropped file to a server-side staging directory first; the server returns the absolute path, and the terminal pastes that. Drags from the JupyterLab file browser work too (no upload, the path is already known).

Solution

The chat sidebar already implements file-drop attachment with the exact server plumbing this feature needs: NBIAPI.uploadFile(file) POSTs multipart/form-data to FileUploadHandler, which writes the file to a per-upload tempdir and returns { serverPath, filename }. This PR adds a small src/terminal-drag.ts module that subscribes to ITerminalTracker.widgetAdded, attaches capture-phase dragover/drop (and Lumino lm-drop) handlers on each terminal, and injects via terminal.content.paste(...) so the PTY treats the text the same as a real paste (preserving bracketed-paste framing that Claude Code relies on for multi-token @<path> parses).

Per-terminal toolbar toggle switches the default mode (@-mention vs shell-escaped raw path). Holding Shift while dropping inverts mode for one drop.

While the server side is shared with chat uploads, this PR also adds three knobs that benefit both surfaces:

  • NBI_TERMINAL_DRAG_DROP_POLICY (force-on / force-off / user-choice): standard admin policy triad. force-off suppresses the terminal listener so files dragged onto a terminal fall through to the browser's default behavior. Useful for hardened deployments that don't want files staged through the upload endpoint.
  • NBI_UPLOAD_MAX_MB (default 50): per-file size cap. Requests over the cap return HTTP 413; the frontend surfaces the rejection via a Notification toast so screen-reader users see the same signal as sighted users.
  • NBI_UPLOAD_RETENTION_HOURS (default 24): lazy retention sweep on the staging directory. Rate-limited to once per 60s so a burst of parallel uploads doesn't pay the listdir + stat cost on every request.

Two micro-architecture notes:

  • Pure helpers (formatForMode, invertMode) live in src/terminal-drag-format.ts so jest can exercise them without pulling Lumino/DOM imports. The same logic applies to shellSingleQuote, which moves from src/utils.ts (loads tiktoken at module import) to a tiny src/shell-utils.ts; utils.ts re-exports for backward compat.
  • isEnabled() reads the policy off the raw NBIAPI.config.capabilities.feature_policies.terminal_drag_drop object rather than the featurePolicies getter, because dragover fires at ~60 Hz and the getter rebuilds an 11-key object per call.

Testing

Automated:

  • tests/test_file_upload.py: size cap 413 path + accepts under cap + 0 disables cap; retention sweep removes stale subdirs while preserving fresh ones, runs after a successful upload, throttles within the 60s window, handles missing root and non-directory entries.
  • tests/ts/terminal-drag.test.ts: formatForMode (mention vs raw, multi-file space-joined, empty input, shell-escape edge cases), invertMode (Shift-inversion semantics).
  • Full pytest sweep (522 passing), jlpm tsc --noEmit, jlpm lint:check, jlpm jest (111 passing) all green.

Manual verification:

  • Drag a file from Finder onto a Claude Code terminal: confirm @</tmp/...> is injected.
  • Toggle the toolbar button: confirm subsequent drops shell-escape the path.
  • Hold Shift while dropping: confirm the per-drop mode flips.
  • Drop a file over the 50 MB cap: confirm a Notification toast appears with the filename and reason.
  • Set NBI_TERMINAL_DRAG_DROP_POLICY=force-off: confirm drops fall through to the browser default.
  • Drag a file from the JupyterLab file browser: confirm the relative path is injected without re-upload.

Risks and follow-ups

  • The ITerminalTracker token is cast to Token<unknown> because @jupyterlab/terminal@4.5.7 ships its own nested @lumino/coreutils, making the nominal Token class distinct from our top-level copy. Runtime is unaffected (Lumino looks up tokens by reference identity against the host JupyterLab's own registry), but the type-system contortion is real. Cleanest follow-up is a yarn resolutions block; out of scope for this PR.
  • Structural types in src/terminal-drag.ts (IMainAreaWidgetLike, ITerminalTrackerLike, etc.) capture only what the module touches. Same nested-version-skew workaround. Will simplify once the resolutions follow-up lands.
  • The size cap runs after Tornado has parsed the multipart body into memory, so the 100 MB default max_body_size is still the worst-case ceiling. Streamed uploads with set_max_body_size would tighten that further; out of scope here.
  • The terminal_drag_drop policy currently maps "user-choice" and "force-on" to the same enabled state because there's no user-facing toggle. If a future Settings panel adds one, swap the literal True in _build_feature_policies_response for the user value and add a locked_keys entry in ConfigHandler.
  • No keyboard fallback for drag-drop. A "Browse files" toolbar button alongside the mode toggle is a natural follow-up.

Closes #248.

@pjdoland pjdoland added the enhancement New feature or request label May 14, 2026
pjdoland added a commit to pjdoland/notebook-intelligence that referenced this pull request May 14, 2026
Five fix-before-merge findings from the review pass:

1. Stale-closure dedupe in attachWorkspacePaths. The drop listener
   captured `selectedContextFiles` at first render, so repeated drops
   of the same path landed duplicate context pills. Move dedupe inside
   the setSelectedContextFiles functional updater so it reads the
   freshest state.

2. Inconsistent error surfacing on chat drop. The terminal path uses
   Notification.warning toasts; the chat path was emitting a
   chat-message-notice that scrolled out of view. Switch to
   Notification.warning to match.

3. Missing keyboard focus after chat drop. Users had to click into the
   prompt textarea before typing. Call promptInputRef.current.focus()
   after a successful attach.

4. Backend image branch only matched `is_image AND is_upload`. File-
   browser image drags set `isImage: true` but not `source: 'upload'`,
   so OpenAI-vision providers received a text breadcrumb without image
   bytes. Drop the is_upload constraint so the workspace image path
   reaches the vision payload too.

5. Path-traversal sandbox on workspace-relative context paths. Backend
   was joining `root_dir + filePath` without normalization; ``..`` and
   absolute paths leaked through. Resolve via realpath + commonpath
   and reject anything that escapes root_dir.

Regression coverage: test_workspace_image_drag_reaches_vision_provider
pins the non-upload image branch; test_path_traversal_outside_workspace
and test_absolute_path_outside_workspace pin the sandbox.
@pjdoland
Copy link
Copy Markdown
Collaborator Author

Small additional change just pushed on this branch: terminal Shift+Enter now sends the ESC+CR "meta-enter" byte sequence to the PTY, so Claude Code's prompt input treats it as a newline instead of submitting.

xterm.js sends bare CR for both Enter and Shift+Enter by default, so Claude Code couldn't distinguish them. The new keydown handler in setupTerminal intercepts plain Shift+Enter (ignoring Ctrl/Cmd/Alt combos and IME composition), prevents xterm's default, and writes \x1b\r via session.send. Same byte sequence macOS Terminal emits for Option+Enter, which Claude Code's prompt already parses as a newline.

Adjacent to the drag-drop work (lives in the same per-terminal setupTerminal); happy to split into its own PR if you'd prefer.

Commit: 3f30a46.

@pjdoland
Copy link
Copy Markdown
Collaborator Author

Ripped the Shift+Enter handler back out (commit c105267).

JupyterLab already merged the upstream fix in jupyterlab/jupyterlab#18523, shipping in 4.6.0a5. Their version is the right shape:

  • Uses xterm.js's first-party attachCustomKeyEventHandler rather than a DOM capture-phase listener.
  • Sends \n (LF), which is the cross-TUI standard that VS Code's terminal, Ink, Bubble Tea, fish, and prompt-toolkit all already accept as a soft newline. Ours was sending \x1b\r, which works for Claude Code but is narrower.
  • Has a keyCode !== 229 IME guard ours was missing.
  • Applies to every JL terminal, not just the ones this extension wires.

Keeping ours alongside would have caused dual-fire (our capture-phase listener runs before xterm.js sees the event, so JL's fix would never trigger when this extension is active) and would have diverged on the injected byte. Cleaner to defer.

Filed jupyterlab/jupyterlab#18888 before noticing the merged PR; will close after testing 4.6.0a5.

@pjdoland pjdoland force-pushed the feat/248-terminal-drag-drop branch from c105267 to 3b3b3da Compare May 16, 2026 01:43
pjdoland added a commit to pjdoland/notebook-intelligence that referenced this pull request May 16, 2026
Five fix-before-merge findings from the review pass:

1. Stale-closure dedupe in attachWorkspacePaths. The drop listener
   captured `selectedContextFiles` at first render, so repeated drops
   of the same path landed duplicate context pills. Move dedupe inside
   the setSelectedContextFiles functional updater so it reads the
   freshest state.

2. Inconsistent error surfacing on chat drop. The terminal path uses
   Notification.warning toasts; the chat path was emitting a
   chat-message-notice that scrolled out of view. Switch to
   Notification.warning to match.

3. Missing keyboard focus after chat drop. Users had to click into the
   prompt textarea before typing. Call promptInputRef.current.focus()
   after a successful attach.

4. Backend image branch only matched `is_image AND is_upload`. File-
   browser image drags set `isImage: true` but not `source: 'upload'`,
   so OpenAI-vision providers received a text breadcrumb without image
   bytes. Drop the is_upload constraint so the workspace image path
   reaches the vision payload too.

5. Path-traversal sandbox on workspace-relative context paths. Backend
   was joining `root_dir + filePath` without normalization; ``..`` and
   absolute paths leaked through. Resolve via realpath + commonpath
   and reject anything that escapes root_dir.

Regression coverage: test_workspace_image_drag_reaches_vision_provider
pins the non-upload image branch; test_path_traversal_outside_workspace
and test_absolute_path_outside_workspace pin the sandbox.
@mbektas
Copy link
Copy Markdown
Collaborator

mbektas commented May 16, 2026

hi @pjdoland this is a great feature. some comments:

  • why do we need the / mode and the toolbar therefore? as I understand it is just to paste the path. we could just have the @ and user can modify when needed.
  • I am running into an issue, if I press enter right after dropping the file, like shown in the capture. it still does the default drop action then (opens the file in new tab). then I need to press enter one more time.
nbi-drag-@-issue

pjdoland added 7 commits May 16, 2026 10:19
Closes plmbr#248.

Drag files from the OS desktop (or the JupyterLab file browser) onto a
terminal pane to insert `@<path>` (Claude Code @-mention syntax) or a
shell-escaped raw absolute path at the cursor. Per-terminal toolbar
toggle switches the default mode; Shift-drop inverts for one drop.

Reuses the existing `/notebook-intelligence/upload-file` endpoint so
chat-sidebar uploads share the new lifecycle work:
- `NBI_UPLOAD_MAX_MB` (default 50): per-file size cap, 413 on exceed.
- `NBI_UPLOAD_RETENTION_HOURS` (default 24): lazy sweep of stale subdirs
  rate-limited to once per 60s so burst uploads don't pay the listdir
  cost on every request.
- `NBI_TERMINAL_DRAG_DROP_POLICY` (force-on / force-off / user-choice):
  admin lock to suppress the listener for hardened deployments.

The frontend handler reads the policy off the raw capabilities object
(not the IFeaturePolicies getter) because dragover fires at ~60Hz and
the getter rebuilds 11 child objects per call. Upload failures surface
via JL Notification toast so screen-reader users see the same signal
as sighted users; the chat sidebar's silent `console.error` path was
the prior baseline.

Pure helpers (formatForMode, invertMode) live in a separate format
module so jest can exercise them without pulling Lumino/DOM imports.
`shellSingleQuote` moves from utils.ts (which loads tiktoken at import
time) to a tiny shell-utils.ts so the same applies to it; utils.ts
re-exports for backward compat.
The file browser starts its Lumino Drag with `supportedActions: 'move'`.
Our lm-dragover handlers set `dropAction = 'copy'`, which Lumino's
validateAction reduces to 'none' since 'copy' isn't in the supported
set. With _dropAction === 'none', lm-drop is never dispatched on
pointerup — the drag silently does nothing.

Mirror the source's proposedAction back as dropAction so we stay
inside whatever the source supports.

Also extend the chat sidebar with a document-level lm-* listener pair
(matching the terminal-drag pattern) so dragging a workspace file from
the file browser onto the chat panel attaches it as context via the
same code path the @-mention picker uses.

Verified end-to-end via Playwright simulation: terminal injects
@<path>, chat sidebar grows a context pill.
When dragging a workspace image from the JL file browser onto the chat
sidebar, fetch the model's base64 content via ContentsManager and
attach it as image context (with the data URL set as the thumbnail
source). Workspace path doubles as the backend serverPath since the
chat backend already resolves workspace-relative paths.

Mirrors the OS-desktop drag-and-drop behavior so users don't see a
"Binary files cannot be attached" rejection when dragging a screenshot
from the file browser.
Five fix-before-merge findings from the review pass:

1. Stale-closure dedupe in attachWorkspacePaths. The drop listener
   captured `selectedContextFiles` at first render, so repeated drops
   of the same path landed duplicate context pills. Move dedupe inside
   the setSelectedContextFiles functional updater so it reads the
   freshest state.

2. Inconsistent error surfacing on chat drop. The terminal path uses
   Notification.warning toasts; the chat path was emitting a
   chat-message-notice that scrolled out of view. Switch to
   Notification.warning to match.

3. Missing keyboard focus after chat drop. Users had to click into the
   prompt textarea before typing. Call promptInputRef.current.focus()
   after a successful attach.

4. Backend image branch only matched `is_image AND is_upload`. File-
   browser image drags set `isImage: true` but not `source: 'upload'`,
   so OpenAI-vision providers received a text breadcrumb without image
   bytes. Drop the is_upload constraint so the workspace image path
   reaches the vision payload too.

5. Path-traversal sandbox on workspace-relative context paths. Backend
   was joining `root_dir + filePath` without normalization; ``..`` and
   absolute paths leaked through. Resolve via realpath + commonpath
   and reject anything that escapes root_dir.

Regression coverage: test_workspace_image_drag_reaches_vision_provider
pins the non-upload image branch; test_path_traversal_outside_workspace
and test_absolute_path_outside_workspace pin the sandbox.
xterm.js sends a bare CR for both Enter and Shift+Enter, so Claude
Code's prompt input parses both as "submit." Intercept Shift+Enter
at the widget level (capture-phase keydown), prevent xterm's default,
and write the ESC+CR meta-enter byte sequence to the PTY via
session.send. macOS Terminal emits the same bytes for Option+Enter
and Claude Code's prompt parses them as a newline.

Cross-platform; respects IME composition and ignores Ctrl/Cmd/Alt
modifier combos so it doesn't shadow other terminal shortcuts.
Mehmet reported that pressing Enter immediately after dropping a file
from the JupyterLab file-browser into a terminal opened the file in a
new tab instead of submitting the terminal command. Two Enter presses
were needed. Same class of bug applied to chat-sidebar drops via HTML5
drag, the file dialog, and image paste.

Root cause: after a successful drop, focus stays on the file-browser
(which still has the file selected). The file-browser's keyboard handler
treats Enter on a selected row as "open the selected item," so the first
Enter is intercepted before the drop target sees it.

Fix:
- terminal-drag: call widget.activate() after paste so the MainAreaWidget
  takes focus and (if it's a background tab in a split) is raised. Skip
  via isDisposed when the async upload path returns to a closed terminal.
- chat-sidebar: focus the prompt textarea after processAndAttachFiles
  succeeds, matching the existing focus call in the lm-drop path.

Tests:
- New jest regression test asserts activate() runs after paste() and
  covers a descendant-of-host drop target (not just the host node).
- afterEach disposes test widgets so document-level lm-* listeners
  registered by setupTerminal don't leak across tests in the file.
- Added @jupyterlab/apputils stub and a DragEvent → MouseEvent subclass
  shim so the lm-drop test can import terminal-drag.ts under jest/jsdom.

Six-agent review converged on the focus fix as correct; followups for
target-size etc. were deferred as out of scope.
@pjdoland pjdoland force-pushed the feat/248-terminal-drag-drop branch from 3b3b3da to 306ccfb Compare May 16, 2026 15:57
@pjdoland
Copy link
Copy Markdown
Collaborator Author

pjdoland commented May 16, 2026

Pushed two updates in 306ccfb:

  1. Rebased onto current main to clear the merge conflict. The conflicting hunks were the _resolve_positive_int_with_env rename in extension.py (kept main's name; renamed the local callsites for NBI_UPLOAD_MAX_MB / NBI_UPLOAD_RETENTION_HOURS).

  2. Fix for issue above: after a file-browser drop, focus stayed on the file-browser (which still had the file selected). The file-browser's keyboard handler treats Enter on a selected row as "open the selected item," so the first Enter was intercepted before the drop target saw it.

The fix calls widget.activate() on the outer MainAreaWidget after paste() runs. That focuses xterm and also raises the terminal tab if it's a background tab in a split. The async upload path now guards on isDisposed so a drop into a terminal that gets closed mid-upload doesn't throw.

Same bug class applied to processAndAttachFiles in the chat sidebar (HTML5 drop, file dialog, image paste), which previously skipped the focus call that the lm-drop path already had. Added it there too.

Tests: new jest regression covers the activate-after-paste ordering and a descendant-of-host drop target; afterEach disposes test widgets so document-level lm-* listeners don't leak.

Verified locally: drag a file onto a terminal, press Enter, the @filename line submits instead of opening the file in a new tab. Six-agent review (resolve-nbi-issues) converged on the fix.

…ball

The TestArchiveSizeCap cases were patching
notebook_intelligence.skill_github_import._get_github_token, which
doesn't exist (the module imports resolve_github_token from
notebook_intelligence.util). mock.patch raised AttributeError at
context-manager enter, breaking the suite on every fresh clone.

Patch the imported reference in the skill_github_import namespace
(resolve_github_token) so the mock takes effect inside _fetch_tarball's
call site.
@mbektas mbektas merged commit cab6fec into plmbr:main May 16, 2026
4 checks passed
@pjdoland pjdoland deleted the feat/248-terminal-drag-drop branch May 16, 2026 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature proposal: drag files from desktop into JupyterLab terminal (Claude Code @-mention support)

2 participants