fix(antigravity): place tool_result images in functionResponse.parts and unify mimeType #1682
…and unify mimeType Move base64 image data from Claude tool_result into functionResponse.parts as inlineData instead of outer sibling parts, preventing context bloat. Unify all inlineData field naming to camelCase mimeType across Claude, OpenAI, and Gemini translators. Add comprehensive edge case tests and Gemini-side regression test for functionResponse.parts preservation.
Summary of Changes
This pull request refines how image data is processed within tool results, specifically for Claude models, by relocating base64 images to a dedicated …
Code Review
This pull request effectively addresses the issue of base64 image data from Claude tool_result bloating the context by moving it into functionResponse.parts. The changes are well-implemented, particularly the logic to separate image and non-image content from tool results. The addition of 7 new test cases is excellent and covers various scenarios, including edge cases, which greatly improves confidence in the change.
The unification of mime_type to mimeType across the Claude, OpenAI, and Gemini translators is also a great cleanup that improves consistency.
I have one suggestion regarding code duplication in the Claude translator to improve maintainability. Overall, this is a solid contribution.
    if functionResponseResult.Get("type").String() == "image" && functionResponseResult.Get("source.type").String() == "base64" {
        inlineDataJSON := `{}`
        if mimeType := functionResponseResult.Get("source.media_type").String(); mimeType != "" {
            inlineDataJSON, _ = sjson.Set(inlineDataJSON, "mimeType", mimeType)
        }
        if data := functionResponseResult.Get("source.data").String(); data != "" {
            inlineDataJSON, _ = sjson.Set(inlineDataJSON, "data", data)
        }

        imagePartJSON := `{}`
        imagePartJSON, _ = sjson.SetRaw(imagePartJSON, "inlineData", inlineDataJSON)
        imagePartsJSON := "[]"
        imagePartsJSON, _ = sjson.SetRaw(imagePartsJSON, "-1", imagePartJSON)
        functionResponseJSON, _ = sjson.SetRaw(functionResponseJSON, "parts", imagePartsJSON)
        functionResponseJSON, _ = sjson.Set(functionResponseJSON, "response.result", "")
    } else {
        functionResponseJSON, _ = sjson.SetRaw(functionResponseJSON, "response.result", functionResponseResult.Raw)
    }
There's significant code duplication between this block (for handling IsObject) and the logic inside the IsArray loop on lines 231-243. Both sections build inlineData and imagePart JSON for base64 images.
To improve maintainability and reduce redundancy, consider extracting this logic into a helper function. For example, you could create a function that takes a gjson.Result representing an image content part and returns the corresponding imagePartJSON string.
Example of a potential helper:
    func createImagePartFromContent(content gjson.Result) (string, bool) {
        if content.Get("type").String() != "image" || content.Get("source.type").String() != "base64" {
            return "", false
        }
        inlineDataJSON := `{}`
        if mimeType := content.Get("source.media_type").String(); mimeType != "" {
            inlineDataJSON, _ = sjson.Set(inlineDataJSON, "mimeType", mimeType)
        }
        if data := content.Get("source.data").String(); data != "" {
            inlineDataJSON, _ = sjson.Set(inlineDataJSON, "data", data)
        }
        imagePartJSON := `{}`
        imagePartJSON, _ = sjson.SetRaw(imagePartJSON, "inlineData", inlineDataJSON)
        return imagePartJSON, true
    }

This would make the code cleaner and easier to manage.
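To illustrate how one shared conversion could serve both the array path and the single-object path, here is a minimal standalone sketch. It is not the PR's code: it uses the standard library's encoding/json instead of gjson/sjson so that it runs on its own, and the names imagePartFromContent and splitToolResult are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imagePartFromContent converts one Claude content block into a Gemini-style
// part holding inlineData. It reports false for anything that is not a
// base64 image, the same guard the suggested helper applies.
// (Hypothetical name; stdlib stand-in for the gjson/sjson version.)
func imagePartFromContent(block map[string]any) (map[string]any, bool) {
	source, _ := block["source"].(map[string]any)
	if block["type"] != "image" || source["type"] != "base64" {
		return nil, false
	}
	inlineData := map[string]any{}
	if mt, _ := source["media_type"].(string); mt != "" {
		inlineData["mimeType"] = mt // camelCase key, not mime_type
	}
	if data, _ := source["data"].(string); data != "" {
		inlineData["data"] = data
	}
	return map[string]any{"inlineData": inlineData}, true
}

// splitToolResult separates image blocks from the remaining content blocks.
// A single-object tool_result can be handled by passing a one-element slice.
func splitToolResult(content []map[string]any) (rest, parts []map[string]any) {
	for _, block := range content {
		if part, ok := imagePartFromContent(block); ok {
			parts = append(parts, part)
			continue
		}
		rest = append(rest, block)
	}
	return rest, parts
}

func main() {
	raw := `[{"type":"text","text":"done"},` +
		`{"type":"image","source":{"type":"base64","media_type":"image/png","data":"aGk="}}]`
	var content []map[string]any
	if err := json.Unmarshal([]byte(raw), &content); err != nil {
		panic(err)
	}
	rest, parts := splitToolResult(content)
	out, _ := json.Marshal(map[string]any{"result": rest, "parts": parts})
	fmt.Println(string(out))
}
```

With a helper shaped like this, the IsObject branch reduces to a single call and the IsArray loop to one call per element, which is the deduplication the comment above is asking for.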
…mage-parts fix(antigravity): place tool_result images in functionResponse.parts and unify mimeType
Summary
- Move base64 images from Claude tool_result content into functionResponse.parts as inlineData, so the Gemini API can properly interpret them as images instead of opaque JSON text.
- Unify all inlineData field naming to camelCase mimeType across Claude, OpenAI, and Gemini translators (some paths were inconsistently using mime_type).

Problem
When a Claude tool_result contains image content (e.g., screenshots from tool calls), the image block was dumped as raw Claude-format JSON into functionResponse.response.result:

    {
      "functionResponse": {
        "response": {
          "result": [
            {"type":"text","text":"..."},
            {"type":"image","source":{"type":"base64","media_type":"image/png","data":"..."}}
          ]
        }
      }
    }

The Gemini API cannot interpret this Claude-format image object: it treats it as opaque text, wasting context and losing the actual image data.

Additionally, inlineData field naming was inconsistent: some code paths used snake_case mime_type instead of the correct camelCase mimeType, affecting both user-input images (Claude translator) and file attachments (OpenAI translator).

Solution
- Claude translator (antigravity_claude_request.go): Detect type: "image" + source.type: "base64" entries within tool_result content. Convert them to Gemini inlineData format and place them in functionResponse.parts[] instead of leaving them as raw JSON in response.result. Handles both array content (mixed text + images) and single object content (image-only).
- Fix mime_type → mimeType for standalone image content blocks (user-input images).
- OpenAI translator (antigravity_openai_request.go): Fix mime_type → mimeType in 3 places where inlineData is constructed for data URIs and file attachments.

Tests
- Gemini-side regression test: functionResponse.parts with inlineData survives the fixCLIToolResponse pipeline.
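For the data URI case mentioned under the OpenAI translator, the following is a minimal sketch of building inlineData with the camelCase mimeType key. It is an illustration only: the inlineData struct and inlineDataFromDataURI are hypothetical names, the parsing assumes the common data:<mime>;base64,<payload> shape, and the actual translator may construct the JSON differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// inlineData mirrors the Gemini part payload. The JSON tag pins the key to
// camelCase mimeType rather than snake_case mime_type.
// (Hypothetical type for illustration.)
type inlineData struct {
	MimeType string `json:"mimeType,omitempty"`
	Data     string `json:"data,omitempty"`
}

// inlineDataFromDataURI parses "data:<mime>;base64,<payload>" and reports
// false for anything that does not match that shape.
func inlineDataFromDataURI(uri string) (inlineData, bool) {
	if !strings.HasPrefix(uri, "data:") {
		return inlineData{}, false
	}
	header, payload, found := strings.Cut(strings.TrimPrefix(uri, "data:"), ",")
	if !found || !strings.HasSuffix(header, ";base64") {
		return inlineData{}, false
	}
	return inlineData{
		MimeType: strings.TrimSuffix(header, ";base64"),
		Data:     payload,
	}, true
}

func main() {
	d, ok := inlineDataFromDataURI("data:image/png;base64,aGk=")
	if !ok {
		panic("not a base64 data URI")
	}
	out, _ := json.Marshal(map[string]any{"inlineData": d})
	fmt.Println(string(out)) // {"inlineData":{"mimeType":"image/png","data":"aGk="}}
}
```

Fixing the key name in one struct tag, as here, is one way to keep every code path on the same spelling; the PR itself applies the rename at each construction site.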