
feat: auto-compress oversized images before API submission#21371

Closed
noah79 wants to merge 1 commit into anomalyco:dev from noah79:feat/image-compression-clipboard

Conversation


@noah79 noah79 commented Apr 7, 2026

Summary

Images pasted from clipboard or read via the file read tool can exceed provider API limits (Anthropic's 5MB base64 limit, dimension caps). This adds automatic image compression using sharp at the provider transform layer, covering all image paths uniformly.

This supersedes #6455 (auto-closed after 60 days) with significant improvements, and addresses the read.ts coverage gap noted in #12069.

Changes

  • New packages/opencode/src/util/image.ts — Image compression utility:

    • Three-phase compression: quality reduction → progressive dimension reduction → final fallback
    • Smart format selection: preserves PNG/WebP for transparency, converts opaque images to JPEG
    • Correct 3.75MB raw-byte threshold (base64 inflates ~4/3x → stays under 5MB API limit)
  • Provider transform integration (transform.ts):

    • New compressImages() step runs before message normalization
    • Handles both image and file parts with graceful error fallback
    • Corrupt/unsupported images surface as error text instead of silently failing
    • message() is now async to support compression
  • Read tool validation (read.ts):

    • Images validated and optimized via sharp before base64 encoding
    • Clear error for corrupt or unsupported image formats
  • 23 unit tests covering compression, resizing, format detection, transparency handling, and error cases
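The three-phase strategy above can be sketched as a simple control loop. This is an illustrative sketch, not the actual `image.ts` API: `compressToLimit`, the `Encode` callback, and the quality/scale schedules are hypothetical stand-ins for the sharp-based encoder the PR describes.

```typescript
// Hypothetical sketch of the three-phase compression strategy.
// `encode` stands in for a sharp-based encoder (quality + max dimension in,
// compressed bytes out); names and schedules are illustrative only.
type Encode = (quality: number, maxDim: number) => Promise<Uint8Array>

// Raw-byte budget: base64 inflates ~4/3x, so 3.75MB raw stays under 5MB encoded.
const LIMIT = Math.floor(3.75 * 1024 * 1024)

export async function compressToLimit(encode: Encode, originalDim: number): Promise<Uint8Array> {
  // Phase 1: reduce quality at full resolution
  for (const quality of [80, 60, 40]) {
    const out = await encode(quality, originalDim)
    if (out.byteLength <= LIMIT) return out
  }
  // Phase 2: progressively shrink the longest dimension
  for (const scale of [0.75, 0.5, 0.25]) {
    const out = await encode(40, Math.round(originalDim * scale))
    if (out.byteLength <= LIMIT) return out
  }
  // Phase 3: final fallback — lowest quality at the smallest dimension
  return encode(20, Math.round(originalDim * 0.25))
}
```

Injecting the encoder keeps the phase logic testable without real image data; the actual utility presumably drives sharp directly.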

Key differences from #6455

|                | Original PR (#6455)        | This PR                                         |
| -------------- | -------------------------- | ----------------------------------------------- |
| Compression    | Quality reduction only     | Three-phase: quality → dimension → fallback     |
| Threshold      | 4MB (incorrect for base64) | 3.75MB raw bytes (correct for 5MB base64 limit) |
| Coverage       | Clipboard only             | All paths via provider transform layer          |
| read.ts        | Not covered                | Validated + optimized before encoding           |
| Error handling | Silent catch               | Surfaced as error text to model                 |

Testing

cd packages/opencode
bun test test/util/image.test.ts      # 23 tests pass
bun test test/provider/transform.test.ts  # 121 tests pass
bun run typecheck                     # Clean across all 13 workspace packages

Images pasted from clipboard or read via the file read tool can exceed
provider API limits (Anthropic's 5MB base64 / dimension caps). This adds
automatic image compression using sharp at the provider transform layer,
covering all image paths — clipboard, file read, and any future sources.

Changes:

- New Image utility (packages/opencode/src/util/image.ts):
  - Three-phase compression: quality reduction → dimension reduction → fallback
  - Smart format selection: preserves PNG/WebP for transparency, JPEG for opaque
  - Correct 3.75MB raw-byte threshold (base64 inflates ~4/3x → 5MB limit)

- Provider transform integration (transform.ts):
  - compressImages() runs before message normalization
  - Handles both image and file parts with graceful error fallback
  - Surfaces corrupt/unsupported images as error text rather than failing silently

- Read tool validation (read.ts):
  - Images validated and optimized via sharp before base64 encoding
  - Clear error message for corrupt or unsupported image formats

- 23 unit tests covering compression, resizing, format detection, and edge cases
- All 121 existing transform tests updated and passing
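The "graceful error fallback" described above — a corrupt image becomes error text the model can see, instead of a silent drop — can be sketched generically. The part shapes and the `compress` callback below are illustrative stand-ins, not opencode's actual message-part types:

```typescript
// Illustrative sketch of the error-surfacing fallback; `Part` and
// `compress` are hypothetical stand-ins for the real transform types.
type Part =
  | { type: "image"; data: Uint8Array; mediaType: string }
  | { type: "text"; text: string }

async function compressImages(
  parts: Part[],
  compress: (data: Uint8Array) => Promise<Uint8Array>,
): Promise<Part[]> {
  return Promise.all(
    parts.map(async (part) => {
      if (part.type !== "image") return part
      try {
        return { ...part, data: await compress(part.data) }
      } catch (e) {
        // Corrupt/unsupported image: tell the model instead of failing silently
        const msg = e instanceof Error ? e.message : String(e)
        return { type: "text" as const, text: `[image could not be processed: ${msg}]` }
      }
    }),
  )
}
```

Because the fallback happens per part, one bad image never aborts the whole request — the remaining parts go through untouched.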
@github-actions github-actions bot added the needs:compliance This means the issue will auto-close after 2 hours. label Apr 7, 2026

github-actions bot commented Apr 7, 2026

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.


github-actions bot commented Apr 7, 2026

The following comment was made by an LLM; it may be inaccurate:

Related PRs Found

Related PR:

Note: The original PR #6455 (which this PR supersedes) was auto-closed after 60 days and is not returned in search results, but it's a predecessor rather than a duplicate.


github-actions bot commented Apr 7, 2026

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

@github-actions github-actions bot removed the needs:compliance This means the issue will auto-close after 2 hours. label Apr 7, 2026
@github-actions github-actions bot closed this Apr 7, 2026
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
…ression

- Add markLargeToolResults() pre-pass: injects cache_control on tool-result
  content parts exceeding 7000 chars (~2000 tokens), covering both Anthropic
  direct and OpenRouter Claude sessions
- Fix stale parts reference bug: collect all large-tool indices first then
  build newContent once (prevents earlier cache marks being overwritten)
- Enforce 4-BP total budget: applyCaching receives preMarkedBPs count and
  reduces maxMsgBreakpoints accordingly
- Add compressImages() async pre-pass: three-phase quality→dimension→fallback
  compression via sharp before API submission (prevents 5MB limit errors)
- Add Image utility (src/util/image.ts): optimizeForUpload, compress, resize
- Wire Image.optimizeForUpload into read.ts image file attachment
- Make ProviderTransform.message() async to support compressImages pipeline
- Add sharp@^0.34.5 dependency (PR anomalyco#21371)
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
Core cache optimizations:
- Move mindContext from dynamicSystem to stableSystem (500-2000+ tokens/turn
  cached at BP1 for sessions with SessionMind context)
- Split failureContext into stableFailures (prior turns, BP1 cached) and
  dynamicFailures (current turn only) using signature-based dedup
- Add markLargeToolResults() pre-pass: cache_control on tool-result content
  parts >7000 chars (~2000 tokens), Anthropic direct + OpenRouter Claude
- Fix stale parts reference bug in markLargeToolResults for multi-tool messages
- Add compressImages() async pre-pass via sharp (PR anomalyco#21371): 3-phase
  quality->dimension->fallback compression prevents 5MB API limit errors
- Session snapshot resets (resetFailureSnapshot/resetEnvDynamicSent) in cleanup
- prompt_async idle race condition fix: check new messages before loop break

Upstream PR cherry-picks:
- PR anomalyco#21535: deterministic queued message wrapping eliminates per-turn cache miss
- PR anomalyco#21492: tool evidence digest (evidence.ts) preserves context through compaction
- PR anomalyco#21507: session processor single-flight summary dedup improvements
- PR anomalyco#21528: prompt_async idle wakeup race condition fix
- PR anomalyco#21500: Levenshtein O(min(N,M)) space with Int32Array two-row algorithm

New tools (PR anomalyco#21399):
- ContextUsageTool (check_context_usage): real-time token/cache usage reporting
- NewSessionTool (new_session): TUI-only, abort + create new session
- TuiEvent.SessionNew bus event and app.tsx handler
- SDK types.gen.ts/sdk.gen.ts EventTuiSessionNew type

Test infrastructure:
- E2E cache tests (OPENCODE_E2E=1) verified 100% cache hit rate on T2+
- Unit tests for large-tool cache breakpoints (4 scenarios)
- Fix pre-existing lsp-deps.test.ts assertion bug (LspTool in make() not all())
- Add await to all ProviderTransform.message() call sites (now async)
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
…rchestrator, multi-credential, codebase indexer

Core Features:
- Session Mind with persistent memory across sessions
- Orchestrator + Worker subagent architecture
- Multi-credential OAuth with auto-refresh
- Codebase indexer and watcher connectors
- Footer status bar with live metrics

Cache & Prompt Optimizations:
- Move mindContext/failureContext to stable system prefix (BP1 cached)
- Large tool result cache_control breakpoints (>7000 chars)
- Deterministic message wrapping (PR anomalyco#21535)
- Tool evidence digest through compaction (PR anomalyco#21492)
- O(1) queue dequeue + single-flight summary (PR anomalyco#21507)
- Levenshtein O(min(N,M)) space optimization (PR anomalyco#21500)
- Three-phase image auto-compression (PR anomalyco#21371)
- ContextUsage and NewSession tools (PR anomalyco#21399)
- E2E cache integration tests with real Anthropic OAuth

Session snapshot resets prevent memory leaks on session delete.
