feat: add OpenAI-compatible LLM provider #240
Adds a new 'openai' LLM provider that uses raw fetch to call any OpenAI-compatible /v1/chat/completions endpoint.

Supported endpoints:
- OpenAI official
- DeepSeek
- SiliconFlow (硅基流动)
- Azure OpenAI
- vLLM / LM Studio / Ollama

Shares OPENAI_API_KEY with the existing OpenAI embedding provider. Respects OPENAI_BASE_URL and OPENAI_MODEL env vars. Includes OPENAI_API_KEY in detectProvider() and VALID_PROVIDERS.

Closes: LLM provider gap for OpenAI-compatible APIs
📝 Walkthrough
This PR adds OpenAI-compatible LLM provider support to the memory system. It introduces a new
Changes: OpenAI Provider Integration
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugin/scripts/session-end.mjs (1)
34-61: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Sequential timeout stacking can stall session-end for minutes.
These fetches run one after another, so under failures/timeouts this path can block for the sum of all timeouts. With the increased values, teardown latency can become very high.
Suggested change (parallelize optional calls, keep per-call timeout)
- if (process.env["CONSOLIDATION_ENABLED"] === "true") {
-   try {
-     await fetch(`${REST_URL}/agentmemory/crystals/auto`, {
-       method: "POST",
-       headers: authHeaders(),
-       body: JSON.stringify({ olderThanDays: 0 }),
-       signal: AbortSignal.timeout(6e4)
-     });
-   } catch {}
-   try {
-     await fetch(`${REST_URL}/agentmemory/consolidate-pipeline`, {
-       method: "POST",
-       headers: authHeaders(),
-       body: JSON.stringify({
-         tier: "all",
-         force: true
-       }),
-       signal: AbortSignal.timeout(12e4)
-     });
-   } catch {}
- }
- if (process.env["CLAUDE_MEMORY_BRIDGE"] === "true") try {
-   await fetch(`${REST_URL}/agentmemory/claude-bridge/sync`, {
-     method: "POST",
-     headers: authHeaders(),
-     signal: AbortSignal.timeout(3e4)
-   });
- } catch {}
+ const backgroundTasks = [];
+ if (process.env["CONSOLIDATION_ENABLED"] === "true") {
+   backgroundTasks.push(
+     fetch(`${REST_URL}/agentmemory/crystals/auto`, {
+       method: "POST",
+       headers: authHeaders(),
+       body: JSON.stringify({ olderThanDays: 0 }),
+       signal: AbortSignal.timeout(6e4)
+     }).catch(() => {})
+   );
+   backgroundTasks.push(
+     fetch(`${REST_URL}/agentmemory/consolidate-pipeline`, {
+       method: "POST",
+       headers: authHeaders(),
+       body: JSON.stringify({ tier: "all", force: true }),
+       signal: AbortSignal.timeout(12e4)
+     }).catch(() => {})
+   );
+ }
+ if (process.env["CLAUDE_MEMORY_BRIDGE"] === "true") {
+   backgroundTasks.push(
+     fetch(`${REST_URL}/agentmemory/claude-bridge/sync`, {
+       method: "POST",
+       headers: authHeaders(),
+       signal: AbortSignal.timeout(3e4)
+     }).catch(() => {})
+   );
+ }
+ await Promise.allSettled(backgroundTasks);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/scripts/session-end.mjs` around lines 34 - 61, The sequential fetch calls in session-end.mjs to `${REST_URL}/agentmemory/crystals/auto`, `${REST_URL}/agentmemory/consolidate-pipeline`, and `${REST_URL}/agentmemory/claude-bridge/sync` can add up their AbortSignal timeouts and stall teardown; change to fire these optional calls in parallel (e.g., collect the individual fetch promises and use Promise.allSettled) while preserving each call’s AbortSignal.timeout and authHeaders(), and handle/log failures per-request rather than awaiting them serially so the total delay is bounded by the longest single timeout instead of the sum.
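The latency argument in this finding (sum of timeouts versus the longest single timeout) can be illustrated with a small simulation. This is only a sketch: `slowTask`, `runSequential`, and `runParallel` are hypothetical names, and the timers stand in for the real hook requests rather than reproducing them.

```typescript
type Task = () => Promise<unknown>;

// Hypothetical stand-in for an optional teardown request that settles after `ms`.
function slowTask(ms: number, fail = false): Task {
  return () =>
    new Promise((resolve, reject) => {
      setTimeout(() => (fail ? reject(new Error("simulated failure")) : resolve(ms)), ms);
    });
}

// Sequential: each task starts only after the previous one settles,
// so the total wait is roughly the sum of the individual durations.
async function runSequential(tasks: Task[]): Promise<number> {
  const start = Date.now();
  for (const task of tasks) {
    try {
      await task();
    } catch {
      // best-effort: failures are ignored, as in the hook script
    }
  }
  return Date.now() - start;
}

// Parallel: all tasks start together and allSettled never rejects,
// so the total wait is bounded by the slowest task.
async function runParallel(tasks: Task[]): Promise<number> {
  const start = Date.now();
  await Promise.allSettled(tasks.map((task) => task()));
  return Date.now() - start;
}
```

With the timeouts quoted in the suggestion (60s, 120s, 30s), the same reasoning gives a worst case of about 210s when awaited serially and about 120s when run in parallel.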
🧹 Nitpick comments (2)
src/config.ts (1)
53-53: ⚡ Quick win
Prefer removing this WHAT-style provider list comment.
This comment describes behavior directly visible in code; keep only rationale comments when needed.
As per coding guidelines, "Avoid code comments explaining WHAT — use clear naming instead".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/config.ts` at line 53, Remove the WHAT-style comment containing "OpenAI-compatible: supports OpenAI, DeepSeek, SiliconFlow, Azure, vLLM, LM Studio" from src/config.ts; this line duplicates information already expressed in code and violates the guideline to avoid WHAT comments; delete that comment and, if necessary, replace it with a brief rationale comment only (e.g., why compatibility matters) near the relevant config symbol or constant to preserve intent.
src/providers/openai.ts (1)
7-25: ⚡ Quick win
Trim WHAT-style comments and keep rationale-only docs.
The large provider-description block is mostly implementation/feature listing. Prefer concise naming + minimal rationale comments.
As per coding guidelines, "Avoid code comments explaining WHAT — use clear naming instead".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/providers/openai.ts` around lines 7 - 25, Replace the large "OpenAI-compatible LLM provider" header comment with a concise rationale-only doc: keep a short one-line description ("OpenAI-compatible LLM provider") and a minimal required/optional env var list (retain OPENAI_API_KEY and optionally OPENAI_BASE_URL, OPENAI_MODEL, MAX_TOKENS) but remove the WHAT-style feature/implementation bullets and example providers; update the top-of-file comment in src/providers/openai.ts (the block currently starting with "OpenAI-compatible LLM provider.") so it only explains purpose and essential configuration variables.
📒 Files selected for processing (6)
plugin/scripts/session-end.mjs
plugin/scripts/stop.mjs
src/config.ts
src/providers/index.ts
src/providers/openai.ts
src/types.ts
hasRealValue(env["MINIMAX_API_KEY"]) ||
hasRealValue(env["OPENAI_API_KEY"])
Honor OPENAI_API_KEY_FOR_LLM=false in provider-kind detection.
detectProvider() supports opt-out, but detectLlmProviderKind() currently reports "llm" whenever OPENAI_API_KEY exists. That creates inconsistent runtime behavior when users intentionally disable OpenAI for LLM.
Suggested patch
if (
hasRealValue(env["ANTHROPIC_API_KEY"]) ||
hasRealValue(env["GEMINI_API_KEY"]) ||
hasRealValue(env["GOOGLE_API_KEY"]) ||
hasRealValue(env["OPENROUTER_API_KEY"]) ||
hasRealValue(env["MINIMAX_API_KEY"]) ||
- hasRealValue(env["OPENAI_API_KEY"])
+ (hasRealValue(env["OPENAI_API_KEY"]) &&
+ env["OPENAI_API_KEY_FOR_LLM"] !== "false")
) {
return "llm";
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/config.ts` around lines 169 - 170, detectLlmProviderKind currently treats
any present OPENAI_API_KEY as selecting "llm" even when users set
OPENAI_API_KEY_FOR_LLM=false; update detectLlmProviderKind to honor the opt-out
by checking that OPENAI_API_KEY exists AND OPENAI_API_KEY_FOR_LLM is not the
string "false" (or equivalent falsy flag) before returning "llm" (e.g., replace
the hasRealValue(env["OPENAI_API_KEY"]) check with
hasRealValue(env["OPENAI_API_KEY"]) && env["OPENAI_API_KEY_FOR_LLM"] !==
"false"); keep the existing MINIMAX_API_KEY logic and ensure detectProvider
remains consistent with this change.
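One way to keep `detectProvider()` and `detectLlmProviderKind()` from drifting apart is to route both through a single predicate. The sketch below assumes that approach; `openAiEnabledForLlm` and `hasLlmKey` are hypothetical names, and the `hasRealValue` shown here is a simplified stand-in (any non-blank string counts), not the project's actual helper.

```typescript
type Env = Record<string, string | undefined>;

// Simplified stand-in for the project's hasRealValue helper (assumption:
// the real one may also reject placeholder values).
function hasRealValue(value: string | undefined): boolean {
  return typeof value === "string" && value.trim().length > 0;
}

// Single source of truth for "may OPENAI_API_KEY drive the LLM layer?".
// The opt-out is the literal string "false", as in the suggested patch.
function openAiEnabledForLlm(env: Env): boolean {
  return hasRealValue(env["OPENAI_API_KEY"]) && env["OPENAI_API_KEY_FOR_LLM"] !== "false";
}

const OTHER_LLM_KEYS = [
  "ANTHROPIC_API_KEY",
  "GEMINI_API_KEY",
  "GOOGLE_API_KEY",
  "OPENROUTER_API_KEY",
  "MINIMAX_API_KEY",
];

// Mirrors the condition under which the patched code returns "llm".
function hasLlmKey(env: Env): boolean {
  return OTHER_LLM_KEYS.some((key) => hasRealValue(env[key])) || openAiEnabledForLlm(env);
}
```

A user who keeps `OPENAI_API_KEY` only for embeddings and sets `OPENAI_API_KEY_FOR_LLM=false` is then treated the same way by every caller of the predicate.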
private async call(systemPrompt: string, userPrompt: string): Promise<string> {
  const url = `${this.baseUrl}/v1/chat/completions`;
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${this.apiKey}`,
    },
    body: JSON.stringify({
      model: this.model,
      max_tokens: this.maxTokens,
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: userPrompt },
      ],
    }),
  });
Add a request timeout to the OpenAI fetch path.
This call currently has no abort/timeout guard, so a stalled upstream can hang compression/summarization indefinitely.
Suggested patch
private async call(systemPrompt: string, userPrompt: string): Promise<string> {
const url = `${this.baseUrl}/v1/chat/completions`;
- const response = await fetch(url, {
- method: "POST",
- headers: {
- "Content-Type": "application/json",
- Authorization: `Bearer ${this.apiKey}`,
- },
- body: JSON.stringify({
- model: this.model,
- max_tokens: this.maxTokens,
- messages: [
- { role: "system", content: systemPrompt },
- { role: "user", content: userPrompt },
- ],
- }),
- });
+ const controller = new AbortController();
+ const timeout = setTimeout(() => controller.abort(), 30_000);
+ let response: Response;
+ try {
+ response = await fetch(url, {
+ method: "POST",
+ signal: controller.signal,
+ headers: {
+ "Content-Type": "application/json",
+ Authorization: `Bearer ${this.apiKey}`,
+ },
+ body: JSON.stringify({
+ model: this.model,
+ max_tokens: this.maxTokens,
+ messages: [
+ { role: "system", content: systemPrompt },
+ { role: "user", content: userPrompt },
+ ],
+ }),
+ });
+ } catch (error) {
+ if ((error as Error).name === "AbortError") {
+ throw new Error("OpenAI API request timed out after 30s");
+ }
+ throw error;
+ } finally {
+ clearTimeout(timeout);
+ }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/providers/openai.ts` around lines 48 - 64, The fetch in the private async
call(systemPrompt: string, userPrompt: string) method has no timeout/abort, so
add an AbortController, start a timer (e.g. using setTimeout) that calls
controller.abort() after a configurable timeout (use an existing property or add
this.requestTimeout), pass controller.signal to fetch, clear the timer on
success, and handle the abort case (detect DOMException/AbortError and throw or
return a descriptive error) so stalled upstream requests don't hang
compression/summarization; reference the call method, this.baseUrl/this.apiKey,
and this.model/maxTokens when applying the change.
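The prompt above asks for a configurable timeout rather than the hard-coded 30 seconds in the patch. A minimal sketch of that variant follows, assuming a standalone helper; `fetchWithTimeout` and its injectable `fetchImpl` parameter are hypothetical and not part of the PR. It checks `controller.signal.aborted` instead of the error name, since only this helper's timer can trigger the abort.

```typescript
// Hypothetical helper: runs a fetch under an AbortController-based deadline.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs: number,
  fetchImpl: typeof fetch = fetch,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } catch (error) {
    // Relabel the failure only when our own timer fired; other errors pass through.
    if (controller.signal.aborted) {
      throw new Error(`request to ${url} timed out after ${timeoutMs}ms`);
    }
    throw error;
  } finally {
    // Always clear the timer so a fast response does not leave it pending.
    clearTimeout(timer);
  }
}
```

Inside the provider, `call()` could pass a value such as `this.requestTimeout` (the property name suggested in the prompt) as `timeoutMs`.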
Reviewed: the OpenAI-compatible provider itself is clean and well-scoped. Like the raw-fetch approach (avoids SDK stainless headers, works against any

Two things before merge:

1. Scope creep on the timeout bumps. The hook script timeout changes (5s→30s, 15s→60s, 30s→120s, 30s→120s) in

Two options:

2. Update README env block + provider table. The OpenAI entry is missing from the LLM-provider table in README.md and from the env block.

Once those land it's a clean v1 of OpenAI support and #185 closes cleanly with this as the successor. Otherwise approving the provider code itself. Will hold the merge for @rohitg00 once the scope question is resolved.
I tested the Ollama side of this PR path against current Ollama Cloud/local behavior and I think this PR can close the Ollama use case if it adds one small passthrough. Findings from 2026-05-12:
The important edge case: for a thinking model (
{
  "model": "kimi-k2.6:cloud",
  "messages": [{"role":"user","content":"Reply exactly: ok"}],
  "max_tokens": 20,
  "stream": false
}
The response had
{"reasoning_effort":"none"}
or:
{"reasoning":{"effort":"none"}}
Native Ollama
So for Ollama Cloud via this OpenAI-compatible provider, I think we need one optional env passthrough such as:
OPENAI_REASONING_EFFORT=none
and then include
OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://127.0.0.1:11434/v1
OPENAI_MODEL=kimi-k2.6:cloud
OPENAI_REASONING_EFFORT=none
or direct cloud:
OPENAI_API_KEY=<ollama cloud key>
OPENAI_BASE_URL=https://ollama.com/v1
OPENAI_MODEL=kimi-k2.6
OPENAI_REASONING_EFFORT=none
Without that, this PR may work for non-thinking models but fail AgentMemory's compression/summarization path on some Ollama Cloud thinking models because the provider treats empty
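The passthrough proposed in this comment could look roughly like the sketch below. It is an illustration only: `buildChatBody` is a hypothetical function, `OPENAI_REASONING_EFFORT` is the variable name suggested in the comment rather than an existing setting, and the field is sent as the top-level `reasoning_effort` form quoted above.

```typescript
type Env = Record<string, string | undefined>;

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequestBody {
  model: string;
  max_tokens: number;
  messages: ChatMessage[];
  reasoning_effort?: string;
}

// Builds the /v1/chat/completions request body. The reasoning_effort field is
// added only when the (proposed) OPENAI_REASONING_EFFORT variable is non-empty,
// so endpoints that do not know the field never see it.
function buildChatBody(
  env: Env,
  model: string,
  maxTokens: number,
  systemPrompt: string,
  userPrompt: string,
): ChatRequestBody {
  const body: ChatRequestBody = {
    model,
    max_tokens: maxTokens,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userPrompt },
    ],
  };
  const effort = env["OPENAI_REASONING_EFFORT"]?.trim();
  if (effort) {
    body.reasoning_effort = effort;
  }
  return body;
}
```

Leaving the field out by default keeps the request identical to the one the PR already sends, so the passthrough stays opt-in.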
Closing in favor of a clean rebased PR that addresses all review feedback.
…lish (#432)

Patch bump per the established rule: additive surface only. OpenAI provider is a new optional surface that activates only when OPENAI_API_KEY is set, gated by OPENAI_API_KEY_FOR_LLM. Telemetry project_name pin is pure observability metadata. Compare polish is docs/website only.

PRs included since v0.9.16:
- #307: OpenAI-compatible LLM provider (universal adapter for OpenAI, Azure OpenAI auto-detected by hostname, DeepSeek, SiliconFlow, vLLM, LM Studio, Ollama). Plus the maintainer-pushed Azure detection + fetch timeout + README scope hint follow-ups. Closes #185, #232, #312, supersedes #240.
- #426: pin worker telemetry project_name
- #427: Compare section polish (title + native plugins cell + grid)

Files bumped (9): package.json, packages/mcp/package.json, plugin/.claude-plugin/plugin.json, plugin/.codex-plugin/plugin.json, src/version.ts, src/types.ts, src/functions/export-import.ts, test/export-import.test.ts, CHANGELOG.md
Summary
Adds a new openai LLM provider that uses raw fetch to call any OpenAI-compatible /v1/chat/completions endpoint.

Motivation
Currently, AgentMemory only supports Anthropic, Gemini, OpenRouter, MiniMax, and the agent-sdk fallback for LLM-backed compression and summarization. Users with OpenAI API keys (or keys from OpenAI-compatible services like DeepSeek, SiliconFlow, Azure OpenAI, vLLM, LM Studio) cannot use their existing credentials for the LLM layer, even though the embedding layer already supports OPENAI_API_KEY via OpenAIEmbeddingProvider.

Changes
- src/types.ts: Add openai to ProviderType union
- src/providers/openai.ts: New OpenAIProvider class using raw fetch (no SDK dependency)
  - /v1/chat/completions endpoint
  - OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL env vars
- src/providers/index.ts: Wire openai case into createBaseProvider()
- src/config.ts:
  - Add OPENAI_API_KEY detection to detectProvider() (with OPENAI_API_KEY_FOR_LLM opt-out)
  - Add OPENAI_API_KEY to detectLlmProviderKind()
  - Add openai to VALID_PROVIDERS set

Supported Endpoints

Configuration Example

Backwards Compatibility
- OPENAI_API_KEY is now checked first in detectProvider(), but only activates when the key is present
- Users who use OPENAI_API_KEY for embedding and prefer another LLM provider can set OPENAI_API_KEY_FOR_LLM=false to skip auto-detection

Testing
- npm run build passes
- npm test passes: 838 tests, 75 test files

Checklist
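On the OPENAI_BASE_URL setting listed under Changes: the provider code quoted in the review appends `/v1/chat/completions` to the base URL, while the Ollama examples in this thread set a base URL that already ends in `/v1`. The sketch below shows one way such values could be normalized. `chatCompletionsUrl` is a hypothetical helper, it is not confirmed that the PR handles this case, and Azure-style URLs are out of scope here.

```typescript
// Hypothetical helper: accepts a base URL with or without a trailing "/v1"
// (and with or without trailing slashes) and returns the chat completions URL.
function chatCompletionsUrl(baseUrl: string): string {
  // Drop any trailing slashes first so "/v1/" and "/v1" are treated alike.
  const trimmed = baseUrl.replace(/\/+$/, "");
  // Strip one trailing "/v1" segment so it is not duplicated below.
  const root = trimmed.endsWith("/v1") ? trimmed.slice(0, -"/v1".length) : trimmed;
  return `${root}/v1/chat/completions`;
}
```

Both `https://api.openai.com` and `http://127.0.0.1:11434/v1` then resolve to a single `/v1/chat/completions` path.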