feat(provider): add openai-compatible custom tool compat#16531

Open
Delqhi wants to merge 1 commit into anomalyco:dev from Other-Open-Source-Projects:feat/custom-provider-compat

Conversation


@Delqhi Delqhi commented Mar 7, 2026

Issue for this PR

Closes #234
Closes #15756

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

This adds an opt-in compatibility layer for custom @ai-sdk/openai-compatible providers that do not emit modern tool_calls reliably.

The changes do three things:

  • rewrite outgoing tools / tool_choice payloads into legacy functions / function_call when the raw-function-call parser is enabled
  • repair incoming legacy function_call responses into structured tool calls before OpenCode processes them
  • optionally recover structured tool intent from JSON or single-tool text replies for providers that return text instead of native tool calls

The goal is to keep OpenCode sessions moving for custom providers that can express tool intent, but do not fully match the modern OpenAI tool-calling shape.
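The first bullet above can be sketched roughly as follows. All names here (`ModernBody`, `toLegacyBody`) are illustrative, not the PR's actual API; the real helper lives in `openai-compatible-compat.ts` and may differ in shape.

```typescript
// Hedged sketch of the modern -> legacy request rewrite described above.
type ModernTool = { type: "function"; function: { name: string; description?: string; parameters?: unknown } }
type ModernBody = { tools?: ModernTool[]; tool_choice?: unknown; [key: string]: unknown }

function toLegacyBody(body: ModernBody): Record<string, unknown> {
  const { tools, tool_choice, ...rest } = body
  if (!tools || tools.length === 0) return body
  const legacy: Record<string, unknown> = {
    ...rest,
    // legacy OpenAI shape: a flat `functions` array of bare function specs
    functions: tools.map((t) => t.function),
  }
  if (tool_choice && typeof tool_choice === "object") {
    // modern { type: "function", function: { name } } -> legacy { name }
    legacy.function_call = { name: (tool_choice as any).function?.name }
  } else if (tool_choice === "auto" || tool_choice === "none") {
    legacy.function_call = tool_choice
  }
  return legacy
}
```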

How did you verify your code works?

  • bun test test/provider/openai-compatible-compat.test.ts
  • bun test test/provider/copilot/copilot-chat-model.test.ts
  • bun test test/session/llm.test.ts
  • bun test test/provider/provider.test.ts
  • bun run typecheck

Screenshots / recordings

Not a UI change.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

Copilot AI review requested due to automatic review settings March 7, 2026 23:15
@github-actions github-actions bot added the needs:compliance This means the issue will auto-close after 2 hours. label Mar 7, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR adds an opt-in compatibility layer for custom @ai-sdk/openai-compatible providers that don't support modern tool_calls reliably. It introduces a toolParser option that can be configured per provider to rewrite tool-related request/response payloads between legacy (function_call/functions) and modern (tool_calls/tools) formats, and to recover tool call intent from structured text or JSON content.

Changes:

  • New openai-compatible-compat.ts module implementing request body rewriting (modern→legacy), JSON response rewriting (legacy→modern), and SSE stream rewriting with three parser types: raw-function-call, json, and single-tool-text
  • Integration into the existing fetch wrapper in provider.ts to intercept and transform requests/responses when toolParser is configured
  • New test file with unit tests covering the four main rewrite paths

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
packages/opencode/src/provider/openai-compatible-compat.ts New compat layer with functions to parse tool parsers config, rewrite request bodies to legacy format, and rewrite JSON/streaming responses to modern tool_calls format
packages/opencode/src/provider/provider.ts Imports compat functions and integrates them into the custom fetch wrapper: rewrites outgoing POST bodies and intercepts responses for transformation
packages/opencode/test/provider/openai-compatible-compat.test.ts Unit tests for request body rewriting, JSON response rewriting, SSE stream single-tool-text rewriting, and SSE stream JSON tag rewriting


}
}

const SYNTHETIC_TOOL_CALL_ID = "call_opencode_compat_0"

Copilot AI Mar 7, 2026


The synthetic tool call ID is a static constant "call_opencode_compat_0", meaning every synthetic tool call across all response cycles will have the same ID. In multi-turn conversations, the conversation history will contain multiple assistant messages with identical tool_calls[0].id and matching tool result tool_call_id values. This can confuse the model when it sees the conversation history, as it cannot distinguish which tool result corresponds to which tool call.

Consider generating a unique ID per call, e.g., using crypto.randomUUID() or a counter-based approach like call_opencode_compat_${Date.now()}_${Math.random().toString(36).slice(2, 8)}.

Suggested change
const SYNTHETIC_TOOL_CALL_ID = "call_opencode_compat_0"
const SYNTHETIC_TOOL_CALL_ID = `call_opencode_compat_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`

Comment on lines +1157 to +1163
if (contentType.includes("text/event-stream")) {
const text = await response.text()
return new Response(rewriteOpenAICompatibleStreamResponse(text, toolParsers), {
status: response.status,
statusText: response.statusText,
headers,
})

Copilot AI Mar 7, 2026


When toolParsers.length > 0, every streaming response (text/event-stream) is fully buffered via response.text() before being re-serialized and returned. This effectively disables incremental streaming for all requests when any tool parser is configured, meaning users won't see any intermediate output (like reasoning text) until the entire response completes. For long completions, this could appear as a significant delay/hang.

For the raw-function-call parser alone, incremental transformation would be possible since transformChunkEvent already operates per-event. The full buffering is only strictly necessary when json or single-tool-text parsers are active (since they need the complete content to decide). Consider short-circuiting to an incremental transform when only raw-function-call is active.
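The incremental path could look roughly like this. This is a sketch, not the PR's code; it assumes a per-event transform callback in the spirit of the `transformChunkEvent` mentioned above, and splits SSE events on blank lines.

```typescript
// Sketch: incrementally rewrite an SSE stream event-by-event instead of
// buffering the whole response. transformEvent is an assumed callback.
function incrementalSSE(
  upstream: ReadableStream<Uint8Array>,
  transformEvent: (json: any) => any,
): ReadableStream<Uint8Array> {
  const decoder = new TextDecoder()
  const encoder = new TextEncoder()
  let buffer = ""
  return upstream.pipeThrough(
    new TransformStream<Uint8Array, Uint8Array>({
      transform(chunk, controller) {
        buffer += decoder.decode(chunk, { stream: true })
        // SSE events are delimited by a blank line; keep the trailing partial
        const events = buffer.split("\n\n")
        buffer = events.pop() ?? ""
        for (const event of events) {
          const data = event.replace(/^data: /m, "")
          if (data.trim() === "[DONE]") {
            controller.enqueue(encoder.encode(event + "\n\n"))
            continue
          }
          try {
            const rewritten = transformEvent(JSON.parse(data))
            controller.enqueue(encoder.encode("data: " + JSON.stringify(rewritten) + "\n\n"))
          } catch {
            // pass through anything that is not a JSON data event
            controller.enqueue(encoder.encode(event + "\n\n"))
          }
        }
      },
      flush(controller) {
        if (buffer) controller.enqueue(encoder.encode(buffer))
      },
    }),
  )
}
```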

Comment on lines +1150 to +1183
if (toolParsers.length === 0) {
return response
}

const headers = new Headers(response.headers)
headers.delete("content-length")
const contentType = headers.get("content-type") ?? ""
if (contentType.includes("text/event-stream")) {
const text = await response.text()
return new Response(rewriteOpenAICompatibleStreamResponse(text, toolParsers), {
status: response.status,
statusText: response.statusText,
headers,
})
}

if (contentType.includes("application/json")) {
const text = await response.text()
try {
return new Response(JSON.stringify(rewriteOpenAICompatibleJsonResponse(JSON.parse(text), toolParsers)), {
status: response.status,
statusText: response.statusText,
headers,
})
} catch {
return new Response(text, {
status: response.status,
statusText: response.statusText,
headers,
})
}
}

return response

Copilot AI Mar 7, 2026


The response rewriting logic runs for every response when toolParsers.length > 0, including non-chat-completions endpoints (e.g., model listing). While rewriteOpenAICompatibleJsonResponse is a no-op when choices[0] doesn't exist, the response body is still fully consumed (via response.text()) and re-wrapped in a new Response object unnecessarily. Consider guarding the response rewriting with a check on the request URL or method to only intercept chat completions responses, similar to how the request body rewriting checks opts.method === "POST".
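A guard along these lines would suffice; the function name and the exact integration point in `provider.ts` are assumptions for illustration.

```typescript
// Sketch: only intercept responses for POSTed chat-completions requests,
// mirroring how the request-body rewrite already checks opts.method === "POST".
function shouldRewriteResponse(input: string | URL | Request, init?: RequestInit): boolean {
  const url = typeof input === "string" ? input : input instanceof URL ? input.href : input.url
  const method = init?.method ?? (input instanceof Request ? input.method : "GET")
  return method.toUpperCase() === "POST" && url.includes("/chat/completions")
}
```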

Author

Delqhi commented Mar 7, 2026

Repro / before-after summary:

Before

  • custom @ai-sdk/openai-compatible provider sends plain text or legacy function_call instead of modern tool_calls
  • OpenCode sees a normal stop and the session ends after the first assistant reply

After with opt-in config

{
  "provider": {
    "custom-provider": {
      "npm": "@ai-sdk/openai-compatible",
      "api": "https://api.example.com/v1",
      "options": {
        "toolParser": [
          { "type": "raw-function-call" },
          { "type": "json" },
          { "type": "single-tool-text", "tool": "bash", "argument": "command" }
        ]
      }
    }
  }
}
  • outgoing tools/tool_choice can be rewritten into legacy functions/function_call
  • incoming legacy function_call gets normalized into tool_calls
  • text-only or tagged JSON tool intents can be recovered into a structured tool call before OpenCode consumes the response
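The single-tool-text case from the config above can be sketched like this; the helper name and the synthetic ID scheme are assumptions, not the PR's actual implementation.

```typescript
// Sketch: with { "type": "single-tool-text", "tool": "bash", "argument": "command" },
// a plain-text reply is wrapped into one structured tool call.
function singleToolText(text: string, tool: string, argument: string) {
  const trimmed = text.trim()
  if (!trimmed) return undefined
  return {
    id: "call_compat_" + Math.random().toString(36).slice(2, 10),
    type: "function" as const,
    // the whole text reply becomes the single configured argument
    function: { name: tool, arguments: JSON.stringify({ [argument]: trimmed }) },
  }
}
```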

That means broken custom providers can keep multi-turn sessions alive without changing behavior for normal providers, because the whole layer is opt-in through provider.options.toolParser.

Verified locally with:

  • bun test test/provider/openai-compatible-compat.test.ts
  • bun test test/provider/copilot/copilot-chat-model.test.ts
  • bun test test/session/llm.test.ts
  • bun test test/provider/provider.test.ts
  • bun run typecheck

@github-actions github-actions bot removed the needs:compliance This means the issue will auto-close after 2 hours. label Mar 8, 2026
Contributor

github-actions bot commented Mar 8, 2026

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@daniel-farina

I tested this and it helped me get Gemma working with OpenCode, documented here:

https://gist.github.com/daniel-farina/87dc1c394b94e45bb700d27e9ea03193

@brunostc

brunostc commented Apr 8, 2026

Can we get this prioritized? -.-

@fenneclabs

Interested in that fix as well.

@BrutchsamaJeanLouis

looking forward to this merge

@deadbaed

deadbaed commented Apr 9, 2026

It would be good to get some traction on this PR so it can be merged. Would it be possible to rebase onto the latest dev branch, @Delqhi?

@AceCodePt

Same. I would reeeally like to try Gemma working with opencode.

@BrutchsamaJeanLouis

BrutchsamaJeanLouis commented Apr 9, 2026

Findings from testing Gemma 4 (31B) + llama.cpp + thinking enabled

Spent some time debugging Gemma 4 tool calling with latest llama.cpp (b8736, PRs #21326 + #21418 merged) and wanted to share what I found — some of these may be relevant to this PR's scope.

1. reasoning_content in streaming deltas breaks tool_call detection

This is the biggest issue. When Gemma 4 has thinking enabled, llama.cpp sends streaming deltas with a reasoning_content field:

{"delta": {"reasoning_content": "The user wants to..."}}

The AI SDK sees ~30 chunks with an unknown field and empty content, then misses the tool_calls that arrive in the final chunks. By the time the toolParser layer runs, the SDK has already dropped them.

This probably needs to be handled in the fetch wrapper (where the response is intercepted), not in the parser layer — strip or drop reasoning_content from deltas before the SDK processes them.
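A sketch of that fetch-wrapper fix (assumed shape, not this PR's code): drop `reasoning_content` from each streamed delta so the SDK only sees fields it knows, leaving the rest of the chunk untouched.

```typescript
// Sketch: strip reasoning_content from one parsed streaming chunk.
function stripReasoning(chunk: any): any {
  const choices = chunk?.choices
  if (!Array.isArray(choices)) return chunk
  return {
    ...chunk,
    choices: choices.map((c: any) => {
      if (!c?.delta || !("reasoning_content" in c.delta)) return c
      // remove only reasoning_content; keep content, tool_calls, etc.
      const { reasoning_content, ...delta } = c.delta
      return { ...c, delta }
    }),
  }
}
```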

2. <tool_use> tag + array format

Some plugins (e.g. oh-my-openagent) instruct models to emit tool calls as:

<tool_use>
[
  {"tool": "read", "parameters": {"filePath": "README.md"}},
  {"tool": "bash", "parameters": {"command": "ls -R"}}
]
</tool_use>

Three things the current json parser wouldn't catch:

  • <tool_use> tag (PR only extracts <tool_call>)
  • Array of tool calls in a single block (PR parses single objects)
  • parameters key (toToolCall checks arguments/input/args but not parameters)
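A more permissive extractor covering those three gaps could look like this; the function name and return shape are illustrative, not the PR's actual `toToolCall` API.

```typescript
// Sketch: accept <tool_use> or <tool_call> tags, arrays of calls, and a
// parameters key alongside arguments/input/args.
function extractToolCalls(text: string): { name: string; arguments: string }[] {
  const match = text.match(/<(tool_use|tool_call)>([\s\S]*?)<\/\1>/)
  if (!match) return []
  let parsed: unknown
  try {
    parsed = JSON.parse(match[2].trim())
  } catch {
    return []
  }
  // a single object and an array of calls are both accepted
  const items = Array.isArray(parsed) ? parsed : [parsed]
  return items.flatMap((item: any) => {
    const name = item?.tool ?? item?.name
    const args = item?.parameters ?? item?.arguments ?? item?.input ?? item?.args
    if (typeof name !== "string" || args === undefined) return []
    return [{ name, arguments: JSON.stringify(args) }]
  })
}
```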

3. KV cache invalidation from response rewriting

If the compat layer converts reasoning_content into content in responses, the conversation history OpenCode sends back won't match what llama.cpp originally generated. The Jinja template formats these differently (<think> tags vs plain text), so the token prefix diverges → KV cache miss every turn → progressive slowdown on longer conversations.

Stripping reasoning_content entirely from responses (rather than converting) avoids this — the reasoning still happens server-side and improves output quality, it just isn't displayed.

Test setup

  • llama.cpp b8736 with --jinja, router mode, Gemma 4 31B Q4_K_M
  • chat-template-kwargs = {"enable_thinking": true}, reasoning-budget = 140
  • OpenCode v1.4.0 on Windows, @ai-sdk/openai-compatible provider
  • Verified llama.cpp returns correct tool_calls + finish_reason: "tool_calls" at the API level (both streaming and non-streaming via curl)



Development

Successfully merging this pull request may close these issues.

  • SDK session.prompt() returns empty responses with custom OpenAI-compatible provider
  • Tool Calling Issues with Open Source Models in OpenCode

8 participants