feat: add OpenAI-compatible LLM provider by fatinghenji · Pull Request #240 · rohitg00/agentmemory

fatinghenji · 2026-05-08T12:21:08Z

Summary

Adds a new openai LLM provider that uses raw fetch to call any OpenAI-compatible /v1/chat/completions endpoint.

Motivation

Currently, AgentMemory only supports Anthropic, Gemini, OpenRouter, MiniMax, and the agent-sdk fallback for LLM-backed compression and summarization. Users with OpenAI API keys (or keys from OpenAI-compatible services like DeepSeek, SiliconFlow, Azure OpenAI, vLLM, LM Studio) cannot use their existing credentials for the LLM layer, even though the embedding layer already supports OPENAI_API_KEY via OpenAIEmbeddingProvider.

Changes

src/types.ts: Add openai to ProviderType union
src/providers/openai.ts: New OpenAIProvider class using raw fetch (no SDK dependency)
- Supports any /v1/chat/completions endpoint
- Respects OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL env vars
src/providers/index.ts: Wire openai case into createBaseProvider()
src/config.ts:
- Add OPENAI_API_KEY detection to detectProvider() (with OPENAI_API_KEY_FOR_LLM opt-out)
- Add OPENAI_API_KEY to detectLlmProviderKind()
- Add openai to VALID_PROVIDERS set
- Update no-key warning message

Supported Endpoints

Service	OPENAI_BASE_URL	Notes
OpenAI official	https://api.openai.com (default)
DeepSeek	https://api.deepseek.com/v1
SiliconFlow	https://api.siliconflow.cn/v1
Azure OpenAI	https://{resource}.openai.azure.com/openai/deployments/{deployment}
vLLM / LM Studio	http://localhost:8000/v1 or http://localhost:1234/v1
Ollama	http://localhost:11434/v1	With --enable-openai

Configuration Example

# Embedding + LLM both via SiliconFlow
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.siliconflow.cn/v1
OPENAI_MODEL=deepseek-chat
OPENAI_EMBEDDING_MODEL=Qwen/Qwen3-Embedding-8B
OPENAI_EMBEDDING_DIMENSIONS=4096
EMBEDDING_PROVIDER=openai

Backwards Compatibility

Default behavior unchanged: OPENAI_API_KEY is now checked first in detectProvider(), but only activates when the key is present
Users who only use OPENAI_API_KEY for embedding and prefer another LLM provider can set OPENAI_API_KEY_FOR_LLM=false to skip auto-detection
No breaking changes to existing provider configurations

Testing

npm run build passes
npm test passes: 838 tests, 75 test files

Checklist

Summary by CodeRabbit

New Features
- Added OpenAI LLM provider support with compatibility for OpenAI-compatible APIs, including configurable base URL and model parameters.
Bug Fixes
- Increased network request timeouts for session management, memory consolidation, and summarization operations to improve reliability.

Adds a new 'openai' LLM provider that uses raw fetch to call any OpenAI-compatible /v1/chat/completions endpoint. Supported endpoints: - OpenAI official - DeepSeek - SiliconFlow (硅基流动) - Azure OpenAI - vLLM / LM Studio / Ollama Shares OPENAI_API_KEY with the existing OpenAI embedding provider. Respects OPENAI_BASE_URL and OPENAI_MODEL env vars. Includes OPENAI_API_KEY in detectProvider() and VALID_PROVIDERS. Closes: LLM provider gap for OpenAI-compatible APIs

vercel · 2026-05-08T12:21:11Z

@fatinghenji is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2026-05-08T12:21:21Z

📝 Walkthrough

Walkthrough

This PR adds OpenAI-compatible LLM provider support to the memory system. It introduces a new OpenAIProvider class, updates provider detection and configuration logic to recognize OpenAI credentials, extends the provider factory to instantiate OpenAI providers, and increases network request timeouts for session lifecycle operations.

Changes

OpenAI Provider Integration

Layer / File(s)	Summary
Type Definitions `src/types.ts`	`ProviderType` union is expanded to include `"openai"` as a supported provider identifier.
OpenAI Provider Class `src/providers/openai.ts`	New `OpenAIProvider` class implements `MemoryProvider` with `compress` and `summarize` methods that POST to `/v1/chat/completions`, with error handling and response validation.
Provider Detection `src/config.ts`	`detectProvider()` recognizes OpenAI config when `OPENAI_API_KEY` is set; `detectLlmProviderKind()` treats `OPENAI_API_KEY` as an llm signal; `loadFallbackConfig()` adds `"openai"` to `VALID_PROVIDERS`; warning text includes `OPENAI_API_KEY`.
Provider Factory `src/providers/index.ts`	`createBaseProvider()` imports `OpenAIProvider` and adds an `"openai"` switch case that constructs `OpenAIProvider` with model, maxTokens, and baseURL from config.
Network Timeouts `plugin/scripts/session-end.mjs`, `plugin/scripts/stop.mjs`	Session lifecycle operations increase `AbortSignal.timeout` values for `/agentmemory/session/end`, `/crystals/auto`, `/consolidate-pipeline`, `/claude-bridge/sync`, and `/summarize` endpoints.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

rohitg00/agentmemory#213: Both PRs increase abort/timeout durations for the same memory-related endpoints (session end, consolidation/crystallization, and summarize) used during session shutdown.
rohitg00/agentmemory#188: Both PRs modify detectProvider in src/config.ts to change provider-detection and fallback behavior.
rohitg00/agentmemory#186: Both PRs add OpenAI-compatible endpoint configuration for OpenAI providers including OPENAI_BASE_URL and related OpenAI env handling.

Poem

🐰 A new provider hops into the fold,
OpenAI's messages bright and bold,
With timeouts extended, no more rushing,
The sessions complete, no more hushing.
Type-safe and wired, the config knows,
Where the autumn rabbit's wisdom flows. 🍂

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and accurately describes the main change: adding OpenAI-compatible LLM provider support.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

plugin/scripts/session-end.mjs (1)

34-61: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sequential timeout stacking can stall session-end for minutes.

These fetches run one after another, so under failures/timeouts this path can block for the sum of all timeouts. With the increased values, teardown latency can become very high.

Suggested change (parallelize optional calls, keep per-call timeout)

- if (process.env["CONSOLIDATION_ENABLED"] === "true") {
-   try {
-     await fetch(`${REST_URL}/agentmemory/crystals/auto`, {
-       method: "POST",
-       headers: authHeaders(),
-       body: JSON.stringify({ olderThanDays: 0 }),
-       signal: AbortSignal.timeout(6e4)
-     });
-   } catch {}
-   try {
-     await fetch(`${REST_URL}/agentmemory/consolidate-pipeline`, {
-       method: "POST",
-       headers: authHeaders(),
-       body: JSON.stringify({
-         tier: "all",
-         force: true
-       }),
-       signal: AbortSignal.timeout(12e4)
-     });
-   } catch {}
- }
- if (process.env["CLAUDE_MEMORY_BRIDGE"] === "true") try {
-   await fetch(`${REST_URL}/agentmemory/claude-bridge/sync`, {
-     method: "POST",
-     headers: authHeaders(),
-     signal: AbortSignal.timeout(3e4)
-   });
- } catch {}
+ const backgroundTasks = [];
+ if (process.env["CONSOLIDATION_ENABLED"] === "true") {
+   backgroundTasks.push(
+     fetch(`${REST_URL}/agentmemory/crystals/auto`, {
+       method: "POST",
+       headers: authHeaders(),
+       body: JSON.stringify({ olderThanDays: 0 }),
+       signal: AbortSignal.timeout(6e4)
+     }).catch(() => {})
+   );
+   backgroundTasks.push(
+     fetch(`${REST_URL}/agentmemory/consolidate-pipeline`, {
+       method: "POST",
+       headers: authHeaders(),
+       body: JSON.stringify({ tier: "all", force: true }),
+       signal: AbortSignal.timeout(12e4)
+     }).catch(() => {})
+   );
+ }
+ if (process.env["CLAUDE_MEMORY_BRIDGE"] === "true") {
+   backgroundTasks.push(
+     fetch(`${REST_URL}/agentmemory/claude-bridge/sync`, {
+       method: "POST",
+       headers: authHeaders(),
+       signal: AbortSignal.timeout(3e4)
+     }).catch(() => {})
+   );
+ }
+ await Promise.allSettled(backgroundTasks);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugin/scripts/session-end.mjs` around lines 34 - 61, The sequential fetch
calls in session-end.mjs to `${REST_URL}/agentmemory/crystals/auto`,
`${REST_URL}/agentmemory/consolidate-pipeline`, and
`${REST_URL}/agentmemory/claude-bridge/sync` can add up their AbortSignal
timeouts and stall teardown; change to fire these optional calls in parallel
(e.g., collect the individual fetch promises and use Promise.allSettled) while
preserving each call’s AbortSignal.timeout and authHeaders(), and handle/log
failures per-request rather than awaiting them serially so the total delay is
bounded by the longest single timeout instead of the sum.

🧹 Nitpick comments (2)

src/config.ts (1)
53-53: ⚡ Quick win

Prefer removing this WHAT-style provider list comment.

This comment describes behavior directly visible in code; keep only rationale comments when needed.

As per coding guidelines, "Avoid code comments explaining WHAT — use clear naming instead".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/config.ts` at line 53, Remove the WHAT-style comment containing
"OpenAI-compatible: supports OpenAI, DeepSeek, SiliconFlow, Azure, vLLM, LM
Studio" from src/config.ts; this line duplicates information already expressed
in code and violates the guideline to avoid WHAT comments—delete that comment
and, if necessary, replace it with a brief rationale comment only (e.g., why
compatibility matters) near the relevant config symbol or constant to preserve
intent.
src/providers/openai.ts (1)
7-25: ⚡ Quick win

Trim WHAT-style comments and keep rationale-only docs.

The large provider-description block is mostly implementation/feature listing. Prefer concise naming + minimal rationale comments.

As per coding guidelines, "Avoid code comments explaining WHAT — use clear naming instead".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/providers/openai.ts` around lines 7 - 25, Replace the large
"OpenAI-compatible LLM provider" header comment with a concise rationale-only
doc: keep a short one-line description ("OpenAI-compatible LLM provider") and a
minimal required/optional env var list (retain OPENAI_API_KEY and optionally
OPENAI_BASE_URL, OPENAI_MODEL, MAX_TOKENS) but remove the WHAT-style
feature/implementation bullets and example providers; update the top-of-file
comment in src/providers/openai.ts (the block currently starting with
"OpenAI-compatible LLM provider.") so it only explains purpose and essential
configuration variables.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/config.ts`:
- Around line 169-170: detectLlmProviderKind currently treats any present
OPENAI_API_KEY as selecting "llm" even when users set
OPENAI_API_KEY_FOR_LLM=false; update detectLlmProviderKind to honor the opt-out
by checking that OPENAI_API_KEY exists AND OPENAI_API_KEY_FOR_LLM is not the
string "false" (or equivalent falsy flag) before returning "llm" (e.g., replace
the hasRealValue(env["OPENAI_API_KEY"]) check with
hasRealValue(env["OPENAI_API_KEY"]) && env["OPENAI_API_KEY_FOR_LLM"] !==
"false"); keep the existing MINIMAX_API_KEY logic and ensure detectProvider
remains consistent with this change.

In `@src/providers/openai.ts`:
- Around line 48-64: The fetch in the private async call(systemPrompt: string,
userPrompt: string) method has no timeout/abort, so add an AbortController,
start a timer (e.g. using setTimeout) that calls controller.abort() after a
configurable timeout (use an existing property or add this.requestTimeout), pass
controller.signal to fetch, clear the timer on success, and handle the abort
case (detect DOMException/AbortError and throw or return a descriptive error) so
stalled upstream requests don't hang compression/summarization; reference the
call method, this.baseUrl/this.apiKey, and this.model/maxTokens when applying
the change.

---

Outside diff comments:
In `@plugin/scripts/session-end.mjs`:
- Around line 34-61: The sequential fetch calls in session-end.mjs to
`${REST_URL}/agentmemory/crystals/auto`,
`${REST_URL}/agentmemory/consolidate-pipeline`, and
`${REST_URL}/agentmemory/claude-bridge/sync` can add up their AbortSignal
timeouts and stall teardown; change to fire these optional calls in parallel
(e.g., collect the individual fetch promises and use Promise.allSettled) while
preserving each call’s AbortSignal.timeout and authHeaders(), and handle/log
failures per-request rather than awaiting them serially so the total delay is
bounded by the longest single timeout instead of the sum.

---

Nitpick comments:
In `@src/config.ts`:
- Line 53: Remove the WHAT-style comment containing "OpenAI-compatible: supports
OpenAI, DeepSeek, SiliconFlow, Azure, vLLM, LM Studio" from src/config.ts; this
line duplicates information already expressed in code and violates the guideline
to avoid WHAT comments—delete that comment and, if necessary, replace it with a
brief rationale comment only (e.g., why compatibility matters) near the relevant
config symbol or constant to preserve intent.

In `@src/providers/openai.ts`:
- Around line 7-25: Replace the large "OpenAI-compatible LLM provider" header
comment with a concise rationale-only doc: keep a short one-line description
("OpenAI-compatible LLM provider") and a minimal required/optional env var list
(retain OPENAI_API_KEY and optionally OPENAI_BASE_URL, OPENAI_MODEL, MAX_TOKENS)
but remove the WHAT-style feature/implementation bullets and example providers;
update the top-of-file comment in src/providers/openai.ts (the block currently
starting with "OpenAI-compatible LLM provider.") so it only explains purpose and
essential configuration variables.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 08a6b7a3-cec0-4cba-800b-bf86735a2970

📥 Commits

Reviewing files that changed from the base of the PR and between 1effa2c and 6879064.

📒 Files selected for processing (6)

plugin/scripts/session-end.mjs
plugin/scripts/stop.mjs
src/config.ts
src/providers/index.ts
src/providers/openai.ts
src/types.ts

coderabbitai · 2026-05-08T12:24:07Z

+    hasRealValue(env["MINIMAX_API_KEY"]) ||
+    hasRealValue(env["OPENAI_API_KEY"])


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Honor OPENAI_API_KEY_FOR_LLM=false in provider-kind detection.

detectProvider() supports opt-out, but detectLlmProviderKind() currently reports "llm" whenever OPENAI_API_KEY exists. That creates inconsistent runtime behavior when users intentionally disable OpenAI for LLM.

Suggested patch

if ( hasRealValue(env["ANTHROPIC_API_KEY"]) || hasRealValue(env["GEMINI_API_KEY"]) || hasRealValue(env["GOOGLE_API_KEY"]) || hasRealValue(env["OPENROUTER_API_KEY"]) || hasRealValue(env["MINIMAX_API_KEY"]) || - hasRealValue(env["OPENAI_API_KEY"]) + (hasRealValue(env["OPENAI_API_KEY"]) && + env["OPENAI_API_KEY_FOR_LLM"] !== "false") ) { return "llm"; }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

hasRealValue(env["MINIMAX_API_KEY"]) ||

hasRealValue(env["OPENAI_API_KEY"])

if (

hasRealValue(env["ANTHROPIC_API_KEY"]) ||

hasRealValue(env["GEMINI_API_KEY"]) ||

hasRealValue(env["GOOGLE_API_KEY"]) ||

hasRealValue(env["OPENROUTER_API_KEY"]) ||

hasRealValue(env["MINIMAX_API_KEY"]) ||

(hasRealValue(env["OPENAI_API_KEY"]) &&

env["OPENAI_API_KEY_FOR_LLM"] !== "false")

) {

return "llm";

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/config.ts` around lines 169 - 170, detectLlmProviderKind currently treats any present OPENAI_API_KEY as selecting "llm" even when users set OPENAI_API_KEY_FOR_LLM=false; update detectLlmProviderKind to honor the opt-out by checking that OPENAI_API_KEY exists AND OPENAI_API_KEY_FOR_LLM is not the string "false" (or equivalent falsy flag) before returning "llm" (e.g., replace the hasRealValue(env["OPENAI_API_KEY"]) check with hasRealValue(env["OPENAI_API_KEY"]) && env["OPENAI_API_KEY_FOR_LLM"] !== "false"); keep the existing MINIMAX_API_KEY logic and ensure detectProvider remains consistent with this change.

coderabbitai · 2026-05-08T12:24:07Z

+  private async call(systemPrompt: string, userPrompt: string): Promise<string> {
+    const url = `${this.baseUrl}/v1/chat/completions`;
+    const response = await fetch(url, {
+      method: "POST",
+      headers: {
+        "Content-Type": "application/json",
+        Authorization: `Bearer ${this.apiKey}`,
+      },
+      body: JSON.stringify({
+        model: this.model,
+        max_tokens: this.maxTokens,
+        messages: [
+          { role: "system", content: systemPrompt },
+          { role: "user", content: userPrompt },
+        ],
+      }),
+    });


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a request timeout to the OpenAI fetch path.

This call currently has no abort/timeout guard, so a stalled upstream can hang compression/summarization indefinitely.

Suggested patch

private async call(systemPrompt: string, userPrompt: string): Promise<string> { const url = `${this.baseUrl}/v1/chat/completions`; - const response = await fetch(url, { - method: "POST", - headers: { - "Content-Type": "application/json", - Authorization: `Bearer ${this.apiKey}`, - }, - body: JSON.stringify({ - model: this.model, - max_tokens: this.maxTokens, - messages: [ - { role: "system", content: systemPrompt }, - { role: "user", content: userPrompt }, - ], - }), - }); + const controller = new AbortController(); + const timeout = setTimeout(() => controller.abort(), 30_000); + let response: Response; + try { + response = await fetch(url, { + method: "POST", + signal: controller.signal, + headers: { + "Content-Type": "application/json", + Authorization: `Bearer ${this.apiKey}`, + }, + body: JSON.stringify({ + model: this.model, + max_tokens: this.maxTokens, + messages: [ + { role: "system", content: systemPrompt }, + { role: "user", content: userPrompt }, + ], + }), + }); + } catch (error) { + if ((error as Error).name === "AbortError") { + throw new Error("OpenAI API request timed out after 30s"); + } + throw error; + } finally { + clearTimeout(timeout); + }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/providers/openai.ts` around lines 48 - 64, The fetch in the private async call(systemPrompt: string, userPrompt: string) method has no timeout/abort, so add an AbortController, start a timer (e.g. using setTimeout) that calls controller.abort() after a configurable timeout (use an existing property or add this.requestTimeout), pass controller.signal to fetch, clear the timer on success, and handle the abort case (detect DOMException/AbortError and throw or return a descriptive error) so stalled upstream requests don't hang compression/summarization; reference the call method, this.baseUrl/this.apiKey, and this.model/maxTokens when applying the change.

rohitg00 · 2026-05-08T18:33:07Z

Reviewed — the OpenAI-compatible provider itself is clean and well-scoped. Like the raw-fetch approach (avoids SDK stainless headers, works against any /v1/chat/completions endpoint including DeepSeek, SiliconFlow, vLLM, LM Studio). The OPENAI_API_KEY_FOR_LLM=false opt-out is a thoughtful escape hatch for users who already use OPENAI_API_KEY for embeddings only.

Two things before merge:

1. Scope creep on the timeout bumps.

The hook script timeout changes (5s→30s, 15s→60s, 30s→120s, 30s→120s) in plugin/scripts/session-end.mjs and plugin/scripts/stop.mjs are unrelated to the OpenAI provider and don't fit this PR's title. Also they're going the opposite direction of #222, which just landed in spirit (5s → 1.5s ceiling on session-start.ts) to fix engine starvation under parallel claude -p fan-out.

Two options:

Preferred — drop the timeout changes from this PR. If OpenAI-compatible endpoints are slow enough that you needed more headroom, that's a real signal but it deserves its own PR with the use-case spelled out, since fix: make session-start / subagent-start hooks fire-and-forget when no read needed (#221) #222 just argued the opposite direction for session-start.
If you want to keep them, please call out in the PR body why session-end/stop need 4-5× more time when session-start is moving the other way, and confirm with @bunke that this isn't going to undo the perf fix in fix: make session-start / subagent-start hooks fire-and-forget when no read needed (#221) #222.

2. Update README env block + provider table.

The OpenAI entry is missing from the LLM-provider table in README.md and from the env block. Once those land it's a clean v1 of OpenAI support and #185 closes cleanly with this as the successor.

Otherwise approving the provider code itself. Will hold the merge for @rohitg00 once the scope question is resolved.

flamerged · 2026-05-12T17:21:39Z

I tested the Ollama side of this PR path against current Ollama Cloud/local behavior and I think this PR can close the Ollama use case if it adds one small passthrough.

Findings from 2026-05-12:

Direct Ollama Cloud exposes OpenAI-compatible discovery: GET https://ollama.com/v1/models returns 200 with cloud models including deepseek-v4-flash, kimi-k2.6, gpt-oss:120b, etc.
Direct native discovery also works: GET https://ollama.com/api/tags returns 200.
Local signed-in Ollama daemon exposes cloud models through the OpenAI-compatible endpoint at http://127.0.0.1:11434/v1/chat/completions.

The important edge case: for a thinking model (kimi-k2.6:cloud), this request returned reasoning but empty final content:

{
  "model": "kimi-k2.6:cloud",
  "messages": [{"role":"user","content":"Reply exactly: ok"}],
  "max_tokens": 20,
  "stream": false
}

The response had message.reasoning populated and message.content empty. The same request worked when I added either:

{"reasoning_effort":"none"}

or:

{"reasoning":{"effort":"none"}}

Native Ollama /api/chat also worked with think:false.

So for Ollama Cloud via this OpenAI-compatible provider, I think we need one optional env passthrough such as:

OPENAI_REASONING_EFFORT=none

and then include reasoning_effort in the /v1/chat/completions body when set. That would let users configure:

OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://127.0.0.1:11434/v1
OPENAI_MODEL=kimi-k2.6:cloud
OPENAI_REASONING_EFFORT=none

or direct cloud:

OPENAI_API_KEY=<ollama cloud key>
OPENAI_BASE_URL=https://ollama.com/v1
OPENAI_MODEL=kimi-k2.6
OPENAI_REASONING_EFFORT=none

Without that, this PR may work for non-thinking models but fail AgentMemory's compression/summarization path on some Ollama Cloud thinking models because the provider treats empty message.content as an unexpected response.

fatinghenji · 2026-05-12T17:40:38Z

Closing in favor of a clean rebased PR that addresses all review feedback.

…lish (#432) Patch bump per the established rule: additive surface only. OpenAI provider is a new optional surface that activates only when OPENAI_API_KEY is set, gated by OPENAI_API_KEY_FOR_LLM. Telemetry project_name pin is pure observability metadata. Compare polish is docs/website only. PRs included since v0.9.16: #307 — OpenAI-compatible LLM provider (universal adapter for OpenAI, Azure OpenAI auto-detected by hostname, DeepSeek, SiliconFlow, vLLM, LM Studio, Ollama). Plus the maintainer- pushed Azure detection + fetch timeout + README scope hint follow-ups. Closes #185, #232, #312, supersedes #240. #426 — pin worker telemetry project_name #427 — Compare section polish (title + native plugins cell + grid) Files bumped (9): package.json, packages/mcp/package.json, plugin/.claude-plugin/plugin.json, plugin/.codex-plugin/plugin.json, src/version.ts, src/types.ts, src/functions/export-import.ts, test/export-import.test.ts, CHANGELOG.md

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

rohitg00 mentioned this pull request May 8, 2026

providers: add OpenAI compatible provider #185

Closed

1 task

This was referenced May 8, 2026

Support Ollama as a native LLM provider #232

Closed

providers: consolidate openai providers #199

Closed

fatinghenji closed this May 12, 2026

fatinghenji mentioned this pull request May 12, 2026

feat: add OpenAI-compatible LLM provider #307

Merged

6 tasks

rohitg00 mentioned this pull request May 16, 2026

chore(release): v0.9.17 — OpenAI provider + telemetry id + Compare polish #432

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add OpenAI-compatible LLM provider#240

feat: add OpenAI-compatible LLM provider#240
fatinghenji wants to merge 1 commit into
rohitg00:mainfrom
fatinghenji:feat/openai-llm-provider

fatinghenji commented May 8, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

vercel Bot commented May 8, 2026

Uh oh!

coderabbitai Bot commented May 8, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

rohitg00 commented May 8, 2026

Uh oh!

flamerged commented May 12, 2026

Uh oh!

fatinghenji commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		hasRealValue(env["MINIMAX_API_KEY"]) \|\|
		hasRealValue(env["OPENAI_API_KEY"])

-    hasRealValue(env["MINIMAX_API_KEY"]) ||
-    hasRealValue(env["OPENAI_API_KEY"])
+   if (
+     hasRealValue(env["ANTHROPIC_API_KEY"]) ||
+     hasRealValue(env["GEMINI_API_KEY"]) ||
+     hasRealValue(env["GOOGLE_API_KEY"]) ||
+     hasRealValue(env["OPENROUTER_API_KEY"]) ||
+     hasRealValue(env["MINIMAX_API_KEY"]) ||
+     (hasRealValue(env["OPENAI_API_KEY"]) &&
+       env["OPENAI_API_KEY_FOR_LLM"] !== "false")
+   ) {
+     return "llm";
+   }

Conversation

fatinghenji commented May 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

Changes

Supported Endpoints

Configuration Example

Backwards Compatibility

Testing

Checklist

Summary by CodeRabbit

Uh oh!

vercel Bot commented May 8, 2026

Uh oh!

coderabbitai Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

rohitg00 commented May 8, 2026

Uh oh!

flamerged commented May 12, 2026

Uh oh!

fatinghenji commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fatinghenji commented May 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 8, 2026 •

edited

Loading