
Conversation

@yyhhyyyyyy (Collaborator) commented Aug 27, 2025

support openrouter gemini 2.5 flash image preview

Summary by CodeRabbit

  • New Features

    • Streamed responses now include images, emitted as image data events.
    • Supports image URLs in OpenAI‑compatible streams and inline image data from Gemini.
    • Provides consistent mime types for image data (e.g., image/png or image-url).
    • Caches image URLs for faster loading when possible.
  • Bug Fixes

    • Improved resilience: if image caching fails, the original URL is used so images still render without interruption.

@coderabbitai coderabbitai bot (Contributor) commented Aug 27, 2025

Walkthrough

Implements streaming-time image handling in OpenAICompatibleProvider: detects image deltas, caches image URLs via devicePresenter, emits image_data events (with cached URL or original), and supports Gemini inline image parts by emitting image_data with provided data and mime type. Regular text delta processing is skipped for chunks containing images.
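The standardized event shape implied by this walkthrough can be sketched as follows. This is a minimal illustration: the `ImageDataEvent` name and `makeImageEvent` helper are invented for the sketch; only the `type`, `image_data`, `data`, and `mimeType` fields appear in the PR diff.

```typescript
// Sketch of the standardized image stream event, inferred from the PR diff.
// ImageDataEvent and makeImageEvent are illustrative names, not provider code.
type ImageDataEvent = {
  type: 'image_data'
  image_data: {
    data: string // cached URL, original URL, or base64 payload
    mimeType: string // e.g. 'deepchat/image-url' or 'image/png'
  }
}

function makeImageEvent(data: string, mimeType: string): ImageDataEvent {
  return { type: 'image_data', image_data: { data, mimeType } }
}

const ev = makeImageEvent('https://example.com/cat.png', 'deepchat/image-url')
console.log(ev.image_data.mimeType) // deepchat/image-url
```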

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| **OpenAI-compatible provider image streaming**<br>`src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts` | Added logic in `handleChatCompletion` to detect image deltas from OpenAI-compatible streams and Gemini parts, cache image URLs via `presenter.devicePresenter.cacheImage`, and emit `image_data` events with the appropriate `mimeType`. Added a warning and fallback when caching fails. No exported API changes. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant L as LLM Stream
  participant P as OpenAICompatibleProvider
  participant D as DevicePresenter
  participant C as Client

  Note over L,P: Streaming chat completion deltas

  L->>P: delta (content/text or image)
  alt delta has image_url
    P->>D: cacheImage(image_url)
    alt cache success
      D-->>P: cachedUrl
      P-->>C: event: image_data(data=cachedUrl, mime=deepchat/image-url)
    else cache failure
      D-->>P: error
      P-->>C: event: image_data(data=image_url, mime=deepchat/image-url)
    end
    Note over P: Skip further delta processing for this chunk
  else delta has Gemini inlineData
    Note over P: parts[].inlineData.{data,mimeType}
    P-->>C: event: image_data(data=inlineData.data, mime=inlineData.mimeType or image/png)
    Note over P: Skip further delta processing for this chunk
  else text/other delta
    P-->>C: existing text/role delta handling
  end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • zerob13

Poem

A nibble, a hop, a stream in the night,
Images peek in the token-light—delight! 🥕
Cache if you can, fallback if you must,
Emit with mime, in stream we trust.
Little ears perk at each data clue—
Picture by picture, the chat hops through.


@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3c3cc80 and 99074d8.

📒 Files selected for processing (1)
  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/development-setup.mdc)

**/*.{js,jsx,ts,tsx}: Use OxLint for code linting
Write logs and comments in English

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/{main,renderer}/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)

src/{main,renderer}/**/*.ts: Use context isolation for improved security
Implement proper inter-process communication (IPC) patterns
Optimize application startup time with lazy loading
Implement proper error handling and logging for debugging

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)

Use Electron's built-in APIs for file system and native dialogs

From main to renderer, broadcast events via EventBus using mainWindow.webContents.send()

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/error-logging.mdc)

**/*.{ts,tsx}: Always use try-catch to handle potential errors
Provide meaningful error messages
Record detailed error logs
Degrade gracefully
Logs should include timestamp, log level, error code, error description, stack trace (where applicable), and relevant context
Log levels should include ERROR, WARN, INFO, DEBUG
Do not swallow errors
Provide user-friendly error messages
Implement error retry mechanisms
Avoid logging sensitive information
Use structured logging
Set appropriate log levels

Enable and adhere to strict TypeScript type checking

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/llmProviderPresenter/providers/*.ts

📄 CodeRabbit inference engine (.cursor/rules/llm-agent-loop.mdc)

src/main/presenter/llmProviderPresenter/providers/*.ts: Each file in src/main/presenter/llmProviderPresenter/providers/*.ts should handle interaction with a specific LLM API, including request/response formatting, tool definition conversion, native/non-native tool call management, and standardizing output streams to a common event format.
Provider implementations must use a coreStream method that yields standardized stream events to decouple the main loop from provider-specific details.
The coreStream method in each Provider must perform a single streaming API request per conversation round and must not contain multi-round tool call loop logic.
Provider files should implement helper methods such as formatMessages, convertToProviderTools, parseFunctionCalls, and prepareFunctionCallPrompt as needed for provider-specific logic.
All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.
When a provider does not support native function calling, it must prepare messages using prompt wrapping (e.g., prepareFunctionCallPrompt) before making the API call.
When a provider supports native function calling, MCP tools must be converted to the provider's format (e.g., using convertToProviderTools) and included in the API request.
Provider implementations should aggregate and yield usage events as part of the standardized stream.
Provider implementations should yield image data events in the standardized format when applicable.
Provider implementations should yield reasoning events in the standardized format when applicable.
Provider implementations should yield tool call events (tool_call_start, tool_call_chunk, tool_call_end) in the standardized format.
Provider implementations should yield stop events with appropriate stop_reason in the standardized format.
Provider implementations should yield error events in the standardized format...

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/**/*.{ts,js,tsx,jsx}

📄 CodeRabbit inference engine (.cursor/rules/project-structure.mdc)

Main-process code goes in src/main

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/**/*.{ts,tsx,vue}

📄 CodeRabbit inference engine (CLAUDE.md)

Use English for all logs and comments

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain one presenter per functional domain in src/main/presenter/

Files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider implementations should yield image data events in the standardized format when applicable.
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider implementations must use a `coreStream` method that yields standardized stream events to decouple the main loop from provider-specific details.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-08-26T14:13:46.578Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T14:13:46.578Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Each LLM provider must implement provider-specific API interactions, convert MCP tools, and normalize streaming responses

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-08-26T14:13:46.578Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T14:13:46.578Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Implement a coreStream method for new providers following the standardized event interface

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : The `coreStream` method in each Provider must perform a single streaming API request per conversation round and must not contain multi-round tool call loop logic.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider implementations should aggregate and yield usage events as part of the standardized stream.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Each file in `src/main/presenter/llmProviderPresenter/providers/*.ts` should handle interaction with a specific LLM API, including request/response formatting, tool definition conversion, native/non-native tool call management, and standardizing output streams to a common event format.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
🧬 Code graph analysis (1)
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts (1)
src/main/presenter/index.ts (1)
  • presenter (188-188)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-check (x64)

Comment on lines +645 to +672

```ts
// Handle image data (OpenRouter Gemini format)
if (delta?.images && Array.isArray(delta.images)) {
  for (const image of delta.images) {
    if (image.type === 'image_url' && image.image_url?.url) {
      try {
        const cachedUrl = await presenter.devicePresenter.cacheImage(image.image_url.url)
        yield {
          type: 'image_data',
          image_data: {
            data: cachedUrl,
            mimeType: 'deepchat/image-url'
          }
        }
      } catch (cacheError) {
        console.warn('[handleChatCompletion] Failed to cache image:', cacheError)
        yield {
          type: 'image_data',
          image_data: {
            data: image.image_url.url,
            mimeType: 'deepchat/image-url'
          }
        }
      }
    }
  }
  continue
}
```

⚠️ Potential issue

Avoid early-continue; it can drop finish_reason/tool_calls in mixed chunks

If a chunk contains both images and finish_reason/tool_calls, the early continue skips subsequent handlers, potentially causing missing stop events or tool-call deltas.

Apply this diff to skip only text parsing for the current chunk while still processing tool_calls/finish_reason:

```diff
-      // Handle image data (OpenRouter Gemini format)
+      // Handle image data (OpenRouter Gemini format)
       if (delta?.images && Array.isArray(delta.images)) {
         for (const image of delta.images) {
           if (image.type === 'image_url' && image.image_url?.url) {
             try {
               const cachedUrl = await presenter.devicePresenter.cacheImage(image.image_url.url)
               yield {
                 type: 'image_data',
                 image_data: {
                   data: cachedUrl,
                   mimeType: 'deepchat/image-url'
                 }
               }
             } catch (cacheError) {
               console.warn('[handleChatCompletion] Failed to cache image:', cacheError)
               yield {
                 type: 'image_data',
                 image_data: {
                   data: image.image_url.url,
                   mimeType: 'deepchat/image-url'
                 }
               }
             }
           }
         }
-        continue
+        // Skip only text parsing for this chunk; allow tool_calls/finish_reason handling below.
+        skipTextForChunk = true
       }
```

Add this guard (outside the selected range) right before the character-level text handling:

```diff
 // declare near other state vars at the top of the loop scope
 let skipTextForChunk = false
-      // If there is no content, continue to the next chunk
-      if (!currentContent) continue
+      // If there is no content, or text parsing is skipped for this chunk, continue
+      if (!currentContent || skipTextForChunk) continue
```

Comment on lines +673 to +688
// 处理 Gemini 原生格式的图片数据(inlineData)
if (delta?.content?.parts && Array.isArray(delta.content.parts)) {
for (const part of delta.content.parts) {
if (part.inlineData && part.inlineData.data) {
yield {
type: 'image_data',
image_data: {
data: part.inlineData.data,
mimeType: part.inlineData.mimeType || 'image/png'
}
}
}
}
continue
}


🛠️ Refactor suggestion

Standardize inline image mimeType and avoid early-continue

  • Inline Gemini images are emitted with mimeType set to the real image type, but elsewhere you use the sentinel 'deepchat/image-base64'. This inconsistency can break downstream renderers expecting the standardized format.
  • Same early-continue issue as above.

Apply this diff to emit a consistent mimeType and skip only text parsing:

```diff
-      // Handle image data in Gemini native format (inlineData)
+      // Handle image data in Gemini native format (inlineData)
       if (delta?.content?.parts && Array.isArray(delta.content.parts)) {
         for (const part of delta.content.parts) {
           if (part.inlineData && part.inlineData.data) {
             yield {
               type: 'image_data',
               image_data: {
-                data: part.inlineData.data,
-                mimeType: part.inlineData.mimeType || 'image/png'
+                data: part.inlineData.data,
+                // Use standardized sentinel for base64 payloads for consistency with other paths
+                mimeType: 'deepchat/image-base64'
               }
             }
           }
         }
-        continue
+        // Skip only text parsing for this chunk; allow tool_calls/finish_reason handling below.
+        skipTextForChunk = true
       }
```

Optional improvement: convert inline base64 to a data URI and cache it via devicePresenter to unify URL-based rendering and reduce large in-memory payloads.
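That optional improvement could start from a small helper like the one below. `toDataUri` is a hypothetical name, the caching step via `devicePresenter` is omitted, and the base64 payload in the example is simply the string "hi" encoded:

```typescript
// Sketch: wrap an inline base64 payload in a data URI so URL-based
// rendering (and URL caching) can be reused for Gemini inlineData parts.
function toDataUri(base64: string, mimeType: string): string {
  return `data:${mimeType};base64,${base64}`
}

const uri = toDataUri('aGk=', 'image/png') // 'aGk=' is base64 for "hi"
console.log(uri) // data:image/png;base64,aGk=
```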

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
 // Handle image data in Gemini native format (inlineData)
 if (delta?.content?.parts && Array.isArray(delta.content.parts)) {
   for (const part of delta.content.parts) {
     if (part.inlineData && part.inlineData.data) {
       yield {
         type: 'image_data',
         image_data: {
           data: part.inlineData.data,
-          mimeType: part.inlineData.mimeType || 'image/png'
+          // Use standardized sentinel for base64 payloads for consistency with other paths
+          mimeType: 'deepchat/image-base64'
         }
       }
     }
   }
-  continue
+  // Skip only text parsing for this chunk; allow tool_calls/finish_reason handling below.
+  skipTextForChunk = true
 }
```
🤖 Prompt for AI Agents
In src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
around lines 673-688, the handler emits inline Gemini images with their native
mimeType and uses an early continue that skips too much; change it to emit the
sentinel mimeType 'deepchat/image-base64' for consistency (set
image_data.mimeType = 'deepchat/image-base64') and remove the broad continue so
only text parsing is skipped (i.e., process image parts but don’t early-return
the whole delta handling), and optionally convert the inline base64 into a data
URI and register/cache it via devicePresenter to enable URL-based rendering and
avoid large in-memory payloads.

@zerob13 zerob13 merged commit 3643fb2 into dev Aug 27, 2025
2 checks passed
@zerob13 zerob13 deleted the feat/openrouter-gemini-image-generation branch January 6, 2026 12:16