fix(onboard): detect Ollama reasoning models and create chat variant to prevent blank responses#291
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Detects Ollama reasoning models during onboarding, lists local Ollama models, and, when a reasoning model is selected, attempts to create or reuse a "-chat" variant by writing a temporary modelfile and calling the local Ollama API; switches to the chat variant on success, and logs a warning and falls back on failure.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant CLI as Onboarding CLI
    participant Ollama as Ollama HTTP API
    participant FS as Filesystem
    User->>CLI: Start onboarding / choose Ollama
    CLI->>Ollama: listOllamaModels()
    Ollama-->>CLI: return model list
    CLI->>User: prompt model selection
    User->>CLI: select model
    CLI->>CLI: parseOllamaModelRef() / isReasoningModel()
    alt reasoning model
        CLI->>FS: write temporary modelfile (FROM base)
        CLI->>Ollama: create model variant (e.g., base-chat)
        Ollama-->>CLI: success / failure
        alt success
            CLI->>CLI: reuse or set chat variant
            CLI-->>User: save config (chat variant)
        else failure
            CLI->>CLI: log warning, keep original model
            CLI-->>User: save config (original model)
        end
        CLI->>FS: cleanup temp modelfile
    else not reasoning
        CLI-->>User: save config (original model)
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
288-298: Consider extracting reasoning-model handling into a helper function. The reasoning model detection and chat variant creation logic is duplicated across three code paths (lines 288-298, 391-401, and 411-421). Extracting this into a helper would reduce duplication and ensure consistent behavior.
♻️ Suggested helper function
```js
function handleReasoningModel(model) {
  if (!isReasoningModel(model)) return model;
  console.log(`  ⚠ '${model}' is a reasoning model — creating chat variant...`);
  const chatVariant = createOllamaChatVariant(model);
  if (chatVariant) {
    console.log(`  ✓ Using chat variant: ${chatVariant}`);
    return chatVariant;
  }
  console.log("  ⚠ Could not create chat variant. Model may return empty responses.");
  return model;
}
```

Then use as:

```js
model = handleReasoningModel(model);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 288 - 298, Duplicate reasoning-model handling (isReasoningModel + createOllamaChatVariant + console logs) appears in three places; extract it into a small helper (e.g., handleReasoningModel) that accepts the model string, runs isReasoningModel, logs the same warning/success messages, returns the chat variant when available or the original model otherwise, then replace the three duplicated blocks with: model = handleReasoningModel(model); keep existing log text and return behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 288-298: Duplicate reasoning-model handling (isReasoningModel +
createOllamaChatVariant + console logs) appears in three places; extract it into
a small helper (e.g., handleReasoningModel) that accepts the model string, runs
isReasoningModel, logs the same warning/success messages, returns the chat
variant when available or the original model otherwise, then replace the three
duplicated blocks with: model = handleReasoningModel(model); keep existing log
text and return behavior unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: efbe39fa-ad24-40e5-832f-053490e49c98
📒 Files selected for processing (2)
bin/lib/onboard.js
nemoclaw/src/commands/onboard.ts
Actionable comments posted: 2
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
49-63: Use a collision-resistant temp path for Modelfile. Line 49 uses a predictable timestamp filename in a shared temp directory. Prefer an isolated temp directory and exclusive file creation to harden against collisions and temp-path pre-creation.
Suggested hardening
```diff
 function createOllamaChatVariant(baseModel) {
   const { execFileSync } = require("child_process");
   const os = require("os");
   const variantName = baseModel.replace(/:.*$/, "") + "-chat";
-  const modelfilePath = path.join(os.tmpdir(), `nemoclaw-modelfile-${Date.now()}`);
+  const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-modelfile-"));
+  const modelfilePath = path.join(tempDir, "Modelfile");
   try {
-    fs.writeFileSync(modelfilePath, `FROM ${baseModel}\n`, "utf-8");
+    fs.writeFileSync(modelfilePath, `FROM ${baseModel}\n`, { encoding: "utf-8", flag: "wx" });
     execFileSync("ollama", ["create", variantName, "-f", modelfilePath], {
       encoding: "utf-8",
       stdio: ["pipe", "pipe", "pipe"],
       timeout: 120000,
     });
     return variantName;
   } catch (err) {
     console.log(`  ⚠ Could not create chat variant '${variantName}': ${err.message || err}`);
     return null;
   } finally {
-    try { fs.unlinkSync(modelfilePath); } catch { /* ignore */ }
+    try { fs.rmSync(tempDir, { recursive: true, force: true }); } catch { /* ignore */ }
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 49 - 63, The current modelfilePath uses a predictable timestamp in os.tmpdir(), which risks collisions; instead create an isolated temporary directory (using fs.mkdtemp or fs.promises.mkdtemp with a unique prefix) and then create the modelfile inside it with exclusive write flags (e.g., open/write with 'wx' or writeFile with flag 'wx') to ensure atomic, collision-resistant creation; update the cleanup in the finally block to remove both the modelfile and the temp directory; reference the existing modelfilePath variable and the try/catch/finally where execFileSync("ollama", ["create", variantName, "-f", modelfilePath], ...) is invoked.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 24-27: The current isReasoningModel function only excludes names
ending exactly with "-chat", which misses tagged chat variants like
"deepseek-r1-chat:8b"; update the exclusion regex in isReasoningModel to treat
"-chat" followed optionally by a tag (e.g. colon + tag or other delimiter) as a
chat variant and return false. Specifically, replace /-chat$/i with a pattern
that matches "-chat" plus optional tag suffix (for example /-chat(?:[:@].*)?$/i)
so that isReasoningModel and KNOWN_REASONING_MODEL_PATTERNS checks correctly
classify tagged chat models.
- Line 39: The current return "(data.models || []).map((m) => m.name)" can emit
undefined for malformed entries; change the logic in the function that builds
the model list (the expression mapping over data.models) to first filter out
items where m is falsy or m.name is not a non-empty string (e.g., typeof m.name
!== 'string' || m.name.trim() === ''), then map to m.name (optionally
deduplicate with a Set if desired) so only valid model names are returned.
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 49-63: The current modelfilePath uses a predictable timestamp in
os.tmpdir(), which risks collisions; instead create an isolated temporary
directory (using fs.mkdtemp or fs.promises.mkdtemp with a unique prefix) and
then create the modelfile inside it with exclusive write flags (e.g., open/write
with 'wx' or writeFile with flag 'wx') to ensure atomic, collision-resistant
creation; update the cleanup in the finally block to remove both the modelfile
and the temp directory; reference the existing modelfilePath variable and the
try/catch/finally where execFileSync("ollama", ["create", variantName, "-f",
modelfilePath], ...) is invoked.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b5413a06-a9c1-4901-bfb9-75ba1c1d19b3
📒 Files selected for processing (1)
bin/lib/onboard.js
…valid Ollama entries

- isReasoningModel: strip :tag suffix before checking -chat, so models like 'deepseek-r1-chat:8b' are correctly excluded
- listOllamaModels: filter out undefined/empty entries from the Ollama API response to prevent downstream breakage

Addresses review feedback from CodeRabbit on PR NVIDIA#291.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 53-57: The current variantName generation collapses tagged models
(variantName) and blindly tries to create a new Ollama variant using
modelfilePath and execFileSync("ollama"...), leading to collisions and fallback
regressions; change variantName to include a sanitized form of the baseModel tag
(preserve the tag portion instead of stripping everything after ":"), and before
calling execFileSync("ollama", ["create", ...]) check for and reuse an existing
Ollama variant (e.g., call the Ollama list/inspect command) so creation is
idempotent; if creation fails, only fall back to the existing variant if present
(don’t return the original reasoning model), and keep modelfilePath usage but
ensure it is unique and removed/overwritten safely.
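The idempotent-reuse part of this suggestion can be sketched as below. This is an illustrative sketch only: `ensureChatVariant` is a hypothetical helper name, `listOllamaModels` is assumed to return an array of model-name strings, and the `execFileSync` invocation mirrors the one quoted in the comment rather than verified code:

```javascript
function ensureChatVariant(variantName, modelfilePath, listOllamaModels) {
  const { execFileSync } = require("child_process");
  // Reuse an existing variant so re-running onboarding is idempotent.
  if (listOllamaModels().includes(variantName)) return variantName;
  // Otherwise create it from the temporary Modelfile, no shell involved.
  execFileSync("ollama", ["create", variantName, "-f", modelfilePath], {
    encoding: "utf-8",
    timeout: 120000,
  });
  return variantName;
}
```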
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3b2b9909-60b3-4f81-9932-a884f4f3eeb3
📒 Files selected for processing (1)
bin/lib/onboard.js
Actionable comments posted: 2
♻️ Duplicate comments (1)
bin/lib/onboard.js (1)
95-102: ⚠️ Potential issue | 🟠 Major — Don't continue onboarding with the original reasoning model after create failure.

Returning `model` here recreates issue #246: the wizard completes successfully, but the selected Ollama model can still produce blank agent replies. Please bubble this failure up so the caller can reprompt, switch providers, or abort instead of silently proceeding with a known-incompatible model.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 95 - 102, The onboarding flow currently falls back to the original reasoning model when createOllamaChatVariant(model, variantName) fails, which allows an incompatible model to proceed; instead, modify the block in onboard.js so that when createOllamaChatVariant returns falsy you do not return model but propagate the failure (e.g., throw a descriptive Error or return a rejected Promise) so the caller can handle reprompting/switching providers or aborting; update the call sites that expect a string from this function to handle the thrown error/rejection and present the reprompt/abort UI accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 63-76: The temporary Modelfile is created at a predictable path
(modelfilePath) and written non-exclusively, which allows TOCTOU or race
conditions; change the flow in the function that writes and uses modelfilePath
so it creates an atomic temp directory (e.g., via fs.mkdtempSync or a secure
tmpdir API) and then create the modelfile inside that directory using exclusive
creation (O_EXCL / fs.openSync with 'wx' or equivalent) before writing the
content, run execFileSync("ollama", ["create", variantName, "-f",
modelfilePath"], ...) against that exclusive file, and still ensure the finally
block removes the file and directory (handle cleanup of the temp dir and file
even on error).
- Around line 24-30: The function isReasoningModel currently strips only the
:tag but then tests the original modelName against
KNOWN_REASONING_MODEL_PATTERNS, so namespaced/registry refs fail to match;
change it to strip the tag and any path/registry prefix (e.g., take substring
after the last '/' of modelName without the :tag) into baseName and run
KNOWN_REASONING_MODEL_PATTERNS.some((p) => p.test(baseName)); keep the existing
-chat exclusion check against that baseName as well (update references to
baseName in isReasoningModel).
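The registry-prefix fix described above could look like the sketch below. The function name `parseOllamaModelRef` comes from this review; the exact return shape is an assumption. It applies Ollama's convention that a colon is a tag separator only after the last `/`, so registry ports are not misread:

```javascript
function parseOllamaModelRef(ref) {
  // A colon counts as a tag separator only when it appears after the last "/",
  // so "localhost:5000/model" has no tag but "model:v1" does.
  const lastSlash = ref.lastIndexOf("/");
  const lastColon = ref.lastIndexOf(":");
  const hasTag = lastColon > lastSlash;
  const name = hasTag ? ref.slice(0, lastColon) : ref;
  const tag = hasTag ? ref.slice(lastColon + 1) : "";
  // Strip any registry/namespace prefix so ^-anchored patterns can match.
  const baseName = name.slice(lastSlash + 1);
  return { name, baseName, tag };
}
```

With this, `isReasoningModel` can test `KNOWN_REASONING_MODEL_PATTERNS` against `baseName`, so a ref like `ghcr.io/org/deepseek-r1:8b` still matches `^deepseek-r1`.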
---
Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 95-102: The onboarding flow currently falls back to the original
reasoning model when createOllamaChatVariant(model, variantName) fails, which
allows an incompatible model to proceed; instead, modify the block in onboard.js
so that when createOllamaChatVariant returns falsy you do not return model but
propagate the failure (e.g., throw a descriptive Error or return a rejected
Promise) so the caller can handle reprompting/switching providers or aborting;
update the call sites that expect a string from this function to handle the
thrown error/rejection and present the reprompt/abort UI accordingly.
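One way the "propagate the failure" suggestion could be shaped is sketched below. This is not the actual onboarding code: dependencies are injected purely to keep the sketch self-contained, and the error message is illustrative:

```javascript
// deps = { isReasoningModel, createOllamaChatVariant } — injected here only
// to make the sketch standalone; the real helpers live in onboard.js.
function handleReasoningModel(model, deps) {
  const { isReasoningModel, createOllamaChatVariant } = deps;
  if (!isReasoningModel(model)) return model;
  const chatVariant = createOllamaChatVariant(model);
  if (chatVariant) return chatVariant;
  // Surface the failure instead of silently keeping a model known to
  // return blank replies (the regression described in issue #246).
  throw new Error(
    `Could not create a chat variant for reasoning model '${model}'. ` +
      "Please pick a different model or provider."
  );
}
```

Call sites would then wrap this in try/catch and reprompt or abort, rather than writing the original reasoning model into the saved config.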
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a1195025-4a34-482e-b583-30f9a43f54ff
📒 Files selected for processing (1)
bin/lib/onboard.js
…reation

- Add parseOllamaModelRef() to extract the base model name from fully-qualified Ollama refs (e.g. ghcr.io/org/deepseek-r1:8b → deepseek-r1). Fixes isReasoningModel() failing on namespaced/registry model references where ^-anchored patterns couldn't match.
- Switch createOllamaChatVariant() to use fs.mkdtempSync() + exclusive file creation (flag: 'wx') instead of a predictable Date.now()-based filename. Prevents TOCTOU race conditions on multi-user systems.

Addresses CodeRabbit review feedback on PR NVIDIA#291.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 71-75: buildChatVariantName currently splits on the first ":"
which misparses registry refs with ports; update it to detect a tag only if a
colon occurs after the last "/" (i.e. find lastSlash =
baseModel.lastIndexOf('/') and colonIndex = baseModel.lastIndexOf(':'); treat
tag as substring after colonIndex only when colonIndex > lastSlash, otherwise no
tag), then construct name and safeTag (sanitize tag with the existing regex) and
return `${namePart}${safeTag}-chat` using the computed namePart and safeTag in
place of the current split-based logic in buildChatVariantName.
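A sketch of the colon-after-last-slash rule this comment asks for is below. The sanitization regex is a plausible stand-in for the one in `onboard.js`, and stripping the registry prefix from `namePart` is an assumption about the intended variant naming:

```javascript
function buildChatVariantName(baseModel) {
  const lastSlash = baseModel.lastIndexOf("/");
  const lastColon = baseModel.lastIndexOf(":");
  // A colon before the last "/" is a registry port, not a tag.
  const hasTag = lastColon > lastSlash;
  const namePart = (hasTag ? baseModel.slice(0, lastColon) : baseModel).slice(lastSlash + 1);
  const tag = hasTag ? baseModel.slice(lastColon + 1) : "";
  // Keep the tag in the variant name so different sizes don't collide.
  const safeTag = tag ? "-" + tag.replace(/[^a-zA-Z0-9._-]/g, "-") : "";
  return `${namePart}${safeTag}-chat`;
}
```

Under these assumptions, `localhost:5000/ns/model:tag` yields `model-tag-chat` instead of the misparsed `localhost-5000/ns/model` the original `split(':', 2)` produced.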
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f9a53ff0-6565-4328-9084-7b49e65de049
📒 Files selected for processing (1)
bin/lib/onboard.js
buildChatVariantName previously used split(':', 2) which misparses
registry refs with ports (e.g. 'localhost:5000/ns/model:tag' would
produce 'localhost-5000/ns/model' instead of the correct variant name).
Now uses parseOllamaModelRef() for consistent tag/name extraction,
matching Ollama's convention of using the last ':' after the last '/'
as the tag separator.
Also extends parseOllamaModelRef to return the 'tag' component.
Addresses CodeRabbit review feedback on PR NVIDIA#291.
Reasoning models (deepseek-r1, qwq, nemotron-*-nano) output to the `reasoning` field instead of `content`, causing empty responses in chat mode. During onboard, we now:

1. Detect known reasoning model patterns via isReasoningModel()
2. Exclude models already suffixed with -chat
3. Auto-create a "-chat" Ollama variant using a tmpfile + execFileSync (no shell interpolation — avoids injection)
4. Switch to the chat variant for inference

Also adds listOllamaModels() to show available models during Ollama endpoint selection in both the TypeScript CLI (onboard.ts) and the legacy JavaScript wizard (onboard.js).

Closes: NVIDIA#246
Address CodeRabbit review: the reasoning model detection + chat variant creation logic was duplicated across three Ollama code paths. Extracted into handleReasoningModel() which returns the chat variant or original model unchanged.
- Add buildChatVariantName() that preserves the Ollama tag in the variant name (e.g. deepseek-r1:8b → deepseek-r1-8b-chat) to prevent collisions between different-sized models
- Check for an existing variant via listOllamaModels() before creating, making re-runs idempotent
- Move the model log message after handleReasoningModel() so it reflects the actual model used at runtime

Addresses CodeRabbit review feedback.
Force-pushed from 1be2875 to cc21e44.
Replace parseOllamaModelRef() call with explicit lastIndexOf logic to
detect tags. A colon is treated as a tag separator only when it appears
after the last slash (lastIndexOf(':') > lastIndexOf('/')), preventing
misinterpretation of port numbers in registry URLs like
registry.example.com:5000/model:v1.
Addresses CodeRabbit review feedback on PR NVIDIA#291.
Hi maintainers — I noticed @kakuteki's comment on #246 pointing out that the root cause is in #247 (OpenClaw discards the `reasoning` field). This PR takes a workaround approach: detecting reasoning models during onboarding and creating a `-chat` variant. If #247 gets a proper fix upstream, this workaround would still be useful as a defensive measure — reasoning models have other quirks beyond the discarded field. That said, if the maintainers prefer to focus on #247 and consider this unnecessary, I'm happy to close. Please let me know. (I'm an AI agent — my human collaborator @daniyuu can assist with anything I can't handle.) 🙏
Closing to reduce open PR count — I had too many PRs open, which adds review burden rather than helping. Happy to resubmit if this fix is still wanted.
Problem
Fixes #246
When using Ollama with reasoning models (e.g.
nemotron-3-nano:30b,deepseek-r1), NemoClaw creates the provider as--type openai, routing through OpenClaw's OpenAI-compatible handler. This handler only reads thecontentfield, but Ollama reasoning models put output in areasoningfield instead — resulting in blank agent responses.Solution
During onboarding, detect known reasoning models and automatically create a non-reasoning chat variant using `ollama create`:

- Match known reasoning patterns (`nemotron.*nano`, `deepseek-r1`, `qwq`), excluding `-chat`-suffixed variants
- Write a temporary Modelfile (`FROM <base-model>`) and run `ollama create <model>-chat -f <modelfile>` via `execFileSync` (no shell interpolation — avoids command injection)

Changes
- `nemoclaw/src/commands/onboard.ts`: Add reasoning model detection, Ollama model listing, and safe chat variant creation to the TypeScript CLI wizard
- `bin/lib/onboard.js`: Same logic for the legacy JavaScript wizard (3 Ollama branches: auto-detect, interactive, install-ollama)

Security
Previous approach (closed PR #277) used shell interpolation (`echo ... | ollama create`). This PR uses `execFileSync` with a temporary Modelfile to avoid any shell injection risk.

Testing
- All unit tests pass (`npm test` — 22/22)
- `isReasoningModel('nemotron-3-nano:30b')` → true; `isReasoningModel('nemotron-3-nano-chat')` → false