wip: use [Image #X] placeholder for clipboard paste and drag and drop #14706

Closed
jackwotherspoon wants to merge 8 commits into main from image-placeholder

Conversation

Collaborator

@jackwotherspoon jackwotherspoon commented Dec 8, 2025

Summary

This branch improves how pasted and drag-and-dropped images are handled in the
Gemini CLI input.

drag.and.drop.images.mp4

Details

Before: Images were inserted as @path/to/image.png file references,
requiring the model to resolve them.

After: Images are displayed as [Image #1], [Image #2], etc. placeholders
in the input, then injected directly as base64-encoded inline data when
submitting to the Gemini API.
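
A minimal sketch of what "injected directly as base64-encoded inline data" could look like. The `InlineDataPart` shape and the `toInlineDataPart` helper are assumptions for illustration, modeled on the inline-data part format of the Gemini API; they are not taken from this branch.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical type: mirrors the inline-data part shape of the Gemini API.
interface InlineDataPart {
  inlineData: { mimeType: string; data: string };
}

// Hypothetical helper: wraps raw image bytes as a base64 inline-data part.
function toInlineDataPart(bytes: Uint8Array, mimeType: string): InlineDataPart {
  return {
    inlineData: {
      mimeType,
      data: Buffer.from(bytes).toString("base64"),
    },
  };
}
```

A part built this way would travel alongside the text part of the prompt, so the model receives the image bytes themselves rather than a path it has to resolve.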

Key Features

  1. Visual placeholders - Users see [Image #N] tags that are
    syntax-highlighted and editable
  2. Deletable references - Users can remove image tags before submitting;
    only images with remaining tags are sent
  3. Multi-file drag-and-drop - Supports dropping multiple images at once,
    with proper handling of escaped spaces in filenames
  4. Mixed content - Non-image files in a multi-drop fall back to @path
    syntax
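
The "deletable references" behaviour can be sketched roughly as follows: scan the submitted text for `[Image #N]` tags and keep only the registered images whose tag is still present. `RegisteredImage`, `referencedImageIds`, and `imagesToSend` are hypothetical names, not the identifiers used in `useClipboardImages.ts`.

```typescript
// Hypothetical registry entry; the real hook may store more fields.
interface RegisteredImage {
  id: number;
  path: string;
}

const IMAGE_TAG = /\[Image #(\d+)\]/g;

// Collect the image numbers that still appear in the input text.
function referencedImageIds(input: string): Set<number> {
  const ids = new Set<number>();
  for (const match of input.matchAll(IMAGE_TAG)) {
    ids.add(Number(match[1]));
  }
  return ids;
}

// Keep only the images whose placeholder survived editing.
function imagesToSend(
  input: string,
  registry: RegisteredImage[],
): RegisteredImage[] {
  const ids = referencedImageIds(input);
  return registry.filter((img) => ids.has(img.id));
}
```

Under this sketch, deleting `[Image #2]` from the input before submitting simply drops that image from the request.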

Files Changed

| File | Purpose |
| --- | --- |
| useClipboardImages.ts | New hook managing image registry and base64 conversion |
| clipboardUtils.ts | Path parsing, validation, multi-file splitting |
| highlight.ts | Syntax highlighting for [Image #N] tokens |
| InputPrompt.tsx | Integration with paste/drop handling |
| useGeminiStream.ts | Injects image parts into API requests |

Supported Formats

PNG, JPEG, WEBP, HEIC, HEIF (per the Gemini API spec)
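
One way to check a dropped file against this list is an extension-to-MIME lookup. This is a sketch only: `IMAGE_MIME_TYPES` and `imageMimeType` are hypothetical names, and the validation in `clipboardUtils.ts` may work differently.

```typescript
// Hypothetical lookup table covering the formats listed above.
const IMAGE_MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".heic": "image/heic",
  ".heif": "image/heif",
};

// Returns the MIME type for a supported image path, or undefined otherwise.
function imageMimeType(filePath: string): string | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return undefined;
  return IMAGE_MIME_TYPES[filePath.slice(dot).toLowerCase()];
}
```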

Details

Related Issues

How to Validate

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@jackwotherspoon jackwotherspoon requested a review from a team as a code owner December 8, 2025 13:56
@gemini-code-assist
Contributor

Summary of Changes

Hello @jackwotherspoon, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the user experience for interacting with image content within the Gemini CLI. Instead of relying on the model to resolve local file paths, images are now displayed as user-friendly [Image #N] placeholders in the input. These placeholders are then converted to base64-encoded inline data and directly injected into the Gemini API request, streamlining the process and improving reliability for image-based prompts.

Highlights

  • Image Handling Improvement: Pasted and drag-and-dropped images are now represented by [Image #N] placeholders in the input, rather than @path references.
  • Base64 Injection: Images are base64-encoded and injected directly into the Gemini API request upon submission, improving reliability and streamlining the process.
  • Interactive Placeholders: [Image #N] tags are syntax-highlighted and editable, allowing users to remove image references before submission, and only images with remaining tags are sent.
  • Multi-file Drag-and-Drop: The system now supports dropping multiple images at once, correctly handling escaped spaces in filenames.
  • Mixed Content Fallback: Non-image files included in a multi-drop scenario will gracefully fall back to the traditional @path syntax.
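
The multi-file and mixed-content highlights can be illustrated with a small sketch. It assumes the terminal delivers dropped paths as one space-separated string with spaces inside a path escaped as `\ `; `splitDroppedPaths` and `toInputTokens` are hypothetical helpers, not the functions in `clipboardUtils.ts`.

```typescript
// Split a dropped string into paths, treating "\ " as a literal space.
function splitDroppedPaths(text: string): string[] {
  const paths: string[] = [];
  let current = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === "\\" && text[i + 1] === " ") {
      current += " ";
      i++; // skip the escaped space
    } else if (ch === " ") {
      if (current) paths.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current) paths.push(current);
  return paths;
}

// Images become numbered placeholders; everything else falls back to @path.
// Assumption: re-escaping of spaces in @path output is left out of this sketch.
function toInputTokens(text: string, firstImageNumber = 1): string[] {
  let next = firstImageNumber;
  return splitDroppedPaths(text).map((p) =>
    /\.(png|jpe?g|webp|heic|heif)$/i.test(p) ? `[Image #${next++}]` : `@${p}`,
  );
}
```

For a drop of `/tmp/a.png /tmp/notes.txt /tmp/b.webp`, this sketch yields `[Image #1]`, `@/tmp/notes.txt`, and `[Image #2]`.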

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a significant improvement for handling images by using placeholders like [Image #1] instead of file paths. The implementation is well-structured, introducing a new useClipboardImages hook, utility functions for image path parsing, and integrating this new flow into the input prompt and submission stream. The code is generally robust, with good handling of asynchronous operations and edge cases like escaped spaces in file paths. I've identified one high-severity issue where adding the same image multiple times does not behave as a user would expect. Please see the detailed comment.

Comment on lines +111 to +114

    // Check if this path is already registered to prevent duplicates
    if (prev.some((img) => img.path === absolutePath)) {
      return prev;
    }

Severity: high

The current implementation prevents registering an image if its path already exists in the images array. This can lead to unexpected behavior for the user. For example, if a user pastes the same image twice, they will see two distinct placeholders in the input (e.g., [Image #1] and [Image #2]), but only the first one will be registered and sent to the API. The second placeholder will be ignored, which is not what the user would expect.

To fix this, you should allow registering the same image path multiple times, each with its own unique ID. This ensures that each placeholder in the UI corresponds to an image that will be included in the prompt.
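
A rough sketch of the suggested fix, assuming the registry is an array of `{ id, path }` entries updated through a state setter. `prev`, `img.path`, and `absolutePath` come from the snippet under review; `RegisteredImage`, `registerImage`, and the numeric `id` field are assumptions for illustration.

```typescript
// Hypothetical registry entry; the real hook may store more fields.
interface RegisteredImage {
  id: number;
  path: string;
}

// Always append a new entry, even if the path is already registered,
// so every placeholder in the input maps to an image that gets sent.
function registerImage(
  prev: RegisteredImage[],
  absolutePath: string,
): RegisteredImage[] {
  const nextId = prev.reduce((max, img) => Math.max(max, img.id), 0) + 1;
  return [...prev, { id: nextId, path: absolutePath }];
}
```

With this approach, pasting the same file twice produces entries 1 and 2 that share a path, so both `[Image #1]` and `[Image #2]` resolve to image data at submit time.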

@github-actions

github-actions Bot commented Dec 8, 2025

Size Change: +8.95 kB (+0.04%)

Total Size: 21.5 MB

| Filename | Size | Change |
| --- | --- | --- |
| ./bundle/gemini.js | 21.5 MB | +8.95 kB (+0.04%) |

ℹ️ View Unchanged

| Filename | Size |
| --- | --- |
| ./bundle/sandbox-macos-permissive-closed.sb | 1.03 kB |
| ./bundle/sandbox-macos-permissive-open.sb | 890 B |
| ./bundle/sandbox-macos-permissive-proxied.sb | 1.31 kB |
| ./bundle/sandbox-macos-restrictive-closed.sb | 3.29 kB |
| ./bundle/sandbox-macos-restrictive-open.sb | 3.36 kB |
| ./bundle/sandbox-macos-restrictive-proxied.sb | 3.56 kB |

compressed-size-action
