
fix: offer Ollama install fallback when cloud API is unavailable#380

Open
futhgar wants to merge 2 commits into NVIDIA:main from futhgar:fix/skip-api-key-local-inference

Conversation

@futhgar

@futhgar futhgar commented Mar 19, 2026

Summary

  • Always show an "Install Ollama" option during `nemoclaw onboard` on Linux (not just macOS), so users have a local inference fallback when build.nvidia.com is overloaded or down
  • Use the official `curl -fsSL https://ollama.com/install.sh | sh` installer on Linux (Homebrew on macOS)
  • Add `install-ollama` as a valid `NEMOCLAW_PROVIDER` value for non-interactive/CI mode
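The non-interactive validation in the last bullet can be sketched roughly as follows. This is a hedged illustration, not the actual onboard code: the function name and the exact provider list are assumptions (the real list in the PR also includes `build`, `nim-local`, `vllm`, and others).

```javascript
// Hypothetical sketch of non-interactive provider validation; the real list
// lives in the onboard module and contains more providers than shown here.
const VALID_PROVIDERS = new Set([
  "build", "openai", "anthropic", "gemini", "ollama", "custom", "install-ollama",
]);

function validateProviderEnv(value) {
  const normalized = String(value || "").trim().toLowerCase();
  if (!VALID_PROVIDERS.has(normalized)) {
    throw new Error(`Unsupported NEMOCLAW_PROVIDER: ${value}`);
  }
  return normalized;
}
```

In CI this would let `NEMOCLAW_PROVIDER=install-ollama` pass validation instead of aborting onboarding.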

Problem

When Ollama is not installed and the NVIDIA API server is unavailable, `nemoclaw onboard` presents only the cloud option on Linux. Users can't obtain an API key, and onboarding fails with no fallback path (issue #301).

Fix

The `install-ollama` option was already implemented for macOS but gated behind `process.platform === "darwin"`. This PR extends it to Linux with the appropriate installer command. The change is minimal; the onboard flow is not restructured.

Fixes #301
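The platform gating described above can be pictured with a small sketch (the function name is illustrative, not the actual code; previously only the darwin branch existed, so Linux users never saw the install option):

```javascript
// Illustrative sketch of the extended platform gate.
function getOllamaInstallCommand(platform = process.platform) {
  if (platform === "darwin") return "brew install ollama";
  if (platform === "linux") return "curl -fsSL https://ollama.com/install.sh | sh";
  return null; // other platforms: no install fallback is offered
}
```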

Test plan

  • New test: verifies the Install Ollama (Linux) option appears when Ollama is not installed on Linux
  • New test: verifies the curl installer (not Homebrew) is invoked on Linux
  • Existing `onboard-selection.test.js` tests still pass
  • All onboard/credentials/inference-config tests pass
  • `npm test` shows no new failures (pre-existing failures in `runtime-shell.test.js` and `install-preflight.test.js` are unrelated)

Summary by CodeRabbit

Release Notes

  • New Features
    • Added automatic Ollama installation option with platform-specific methods: brew install on macOS and curl-based installation on Linux
    • Enhanced provider selection with improved fallback behavior during non-interactive setup

Signed-off-by: Josue Balandrano Coronel josuebc@pm.me

@coderabbitai
Contributor

coderabbitai bot commented Mar 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Extended src/lib/onboard.ts to support a new install-ollama provider option with platform-specific installation methods (brew for macOS, curl for Linux), including fallback logic when the exact provider option is unavailable in non-interactive mode.

Changes

  • Install-Ollama Provider Support (`src/lib/onboard.ts`): added `install-ollama` provider validation, implemented platform-specific installation logic (brew on macOS, curl on Linux), and introduced a graceful fallback to the existing `ollama` option when `install-ollama` is requested but unavailable in non-interactive mode.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A hop, skip, and local install away,
When API servers fail and block the way,
Ollama installs with brew and with curl,
Cross-platform magic, let fallbacks unfurl!
No more brittle paths—graceful they dance,
A second chance for every onboard advance. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 75.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (4 passed)

  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title directly describes the main change, offering an Ollama install fallback when the cloud API is unavailable, which aligns with the primary objective of providing a local inference option when the NVIDIA API server is down.
  • Linked Issues Check (✅ Passed): the PR fully addresses issue #301 by adding the `install-ollama` provider option for both macOS and Linux with platform-specific installers, enabling local inference when the NVIDIA API is unavailable.
  • Out of Scope Changes Check (✅ Passed): all changes are directly scoped to issue #301: extending non-interactive provider validation, adding install-ollama option generation, updating installation logic, and supporting the fallback behavior for local inference.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 607-613: The installer calls that invoke run("brew install
ollama", { ignoreError: true }) and run("curl -fsSL
https://ollama.com/install.sh | sh", { ignoreError: true }) must not swallow
failures; update the onboarding logic so the run calls return and/or throw on
error (remove ignoreError: true) and explicitly check the result or catch
exceptions in the surrounding function (the installer branch in onboard.js) and
on failure log a clear error and abort onboarding (process.exit(1) or rethrow)
before proceeding to switch provider/model to Ollama.
- Around line 502-509: The current options builder omits the "install-ollama"
option when Ollama is present (hasOllama/ollamaRunning), causing
CI/non-interactive runs with NEMOCLAW_PROVIDER=install-ollama to fail; modify
the logic that populates options (the block referencing hasOllama, ollamaRunning
and pushing to options) to special-case the environment variable
NEMOCLAW_PROVIDER === "install-ollama" so the "install-ollama" entry is added
unconditionally when that env var is set, or alternatively relax the
non-interactive provider validation to accept "install-ollama" even if the
option wasn't pushed; update references in the same function that validate the
chosen provider to check process.env.NEMOCLAW_PROVIDER as an allowed provider.
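The first review comment above, on not swallowing installer failures, might be addressed along these lines. This is a hedged sketch, not the real `onboard.js` branch: the function name is illustrative, `run` is injected rather than imported, and failure is signaled by throwing (the caller could equally log and `process.exit(1)`).

```javascript
// Sketch of the suggested fix: drop { ignoreError: true } so a failed
// install aborts onboarding instead of silently switching to Ollama.
function installOllama(run, platform) {
  const cmd = platform === "darwin"
    ? "brew install ollama"
    : "curl -fsSL https://ollama.com/install.sh | sh";
  const result = run(cmd); // no ignoreError: failures must surface
  if (result.status !== 0) {
    throw new Error(`Ollama install failed (exit ${result.status}); aborting onboarding.`);
  }
  return result;
}
```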

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 69e50485-aeb9-43e5-8f18-c3ea67223c15

📥 Commits

Reviewing files that changed from the base of the PR and between 9513eca and feac390.

📒 Files selected for processing (2)
  • bin/lib/onboard.js
  • test/onboard-selection.test.js

@wscurran added labels: `enhancement: feature` (requests for new capabilities in NemoClaw), `Local Models` (running NemoClaw with local models), `priority: medium` (should be addressed in upcoming releases) on Mar 19, 2026
@futhgar force-pushed the fix/skip-api-key-local-inference branch from d94ddea to d40b6fc on March 28, 2026 21:09
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

2216-2260: Minor indentation inconsistency at Line 2226.

The sleep(2) call has 8-space indentation instead of 6, breaking alignment with the surrounding code. This doesn't affect functionality but harms readability.

🧹 Fix indentation

```diff
       run("OLLAMA_HOST=0.0.0.0:11434 ollama serve > /dev/null 2>&1 &", { ignoreError: true });
-        sleep(2);
+      sleep(2);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 2216 - 2260, In the selected.key ===
"install-ollama" branch adjust the indentation of the sleep(2); line so it
aligns with the surrounding console.log and run calls (same indentation level as
the preceding run("OLLAMA_HOST=...") and console.log lines) — replace the
8-space indent with the 6-space indent used by the block to restore consistent
formatting.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/onboard-selection.test.js`:
- Around line 1327-1333: The test is selecting option "2" (OpenAI) instead of
Install Ollama; update the credentials.prompt stub in the test so the first
prompt returns "7" (Install Ollama) instead of "2" to exercise the ollama flow
(reference credentials.prompt and setupNim()), and add a fake curl binary into a
temporary bin and ensure the spawnSync call used by the code under test sees
that bin by updating the spawnSync/env used in the test to include the temp bin
at the front of PATH (reference spawnSync and PATH) so the install-ollama path
can detect curl and proceed.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 722fed8a-52cc-48e1-a9c7-c02618cb0873

📥 Commits

Reviewing files that changed from the base of the PR and between d94ddea and d40b6fc.

📒 Files selected for processing (2)
  • bin/lib/onboard.js
  • test/onboard-selection.test.js

kjw3 added a commit that referenced this pull request Mar 31, 2026
## Summary

Smooth out inference configuration during `install.sh` / `nemoclaw
onboard`, especially when provider authorization, credential formatting,
endpoint probing, or final inference application fail.

This PR makes the hosted-provider onboarding path recoverable instead of
brittle:
- normalize and safely handle credential input
- classify validation failures more accurately
- let users re-enter credentials in place
- make final `openshell inference set` failures recoverable
- normalize over-specified custom base URLs
- add lower-level `back` / `exit` navigation so users can move up a
level without restarting the whole install
- clarify recovery prompts with explicit commands (`retry`, `back`,
`exit`)

## What Changed

- refactored provider probe execution to use direct `curl` argv
invocation instead of `bash -c`
- normalized credential values before use/persistence
- added structured auth / transport / model / endpoint failure
classification
- added in-place credential re-entry for hosted providers:
  - NVIDIA Endpoints
  - OpenAI
  - Anthropic
  - Google Gemini
  - custom OpenAI-compatible endpoints
  - custom Anthropic-compatible endpoints
- wrapped final provider/apply failures in interactive recovery instead
of hard abort
- added command-style recovery prompts:
  - `retry`
  - `back`
  - `exit`
- allowed `back` from lower-level inference prompts (model entry, base
URL entry, recovery prompts)
- normalized custom endpoint inputs to the minimum usable base URL
- removed stale `NVIDIA Endpoints (recommended)` wording
- secret prompts now show masked `*` feedback while typing/pasting

## Validation

```bash
npx vitest run test/credentials.test.js test/onboard-selection.test.js test/onboard.test.js
npx vitest run test/cli.test.js
npx eslint bin/lib/credentials.js bin/lib/onboard.js test/credentials.test.js test/onboard-selection.test.js test/onboard.test.js
npx tsc -p jsconfig.json --noEmit
```

## Issue Mapping

Fully addressed in this PR:
- Fixes #1099
- Fixes #1101
- Fixes #1130

Substantially addressed / partially addressed:
- #987
- improves NVIDIA validation behavior and failure classification so
false/misleading connectivity failures are much less likely, but this PR
is framed as onboarding recovery hardening rather than a WSL-specific
networking fix
- #301
- improves graceful handling when validation/apply fails, especially for
transport/upstream problems, but does not add provider auto-fallback or
a broader cloud-outage fallback strategy
- #446
- improves recovery specifically for the inference-configuration step,
but does not fully solve general resumability across all onboarding
steps

Related implementation direction:
- #890
- this PR aligns with the intent of safer/non-shell probe execution and
clearer validation reporting
- #380
- not implemented here; no automatic provider fallback was added in this
branch

## Notes

- This PR intentionally does not weaken validation or reopen old
onboarding shortcuts.
- Unrelated local `tmp/` noise was left out of the branch.

Signed-off-by: Kevin Jones <kejones@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Interactive onboarding navigation (`back`/`exit`/`quit`) with
credential re-prompting and retry flows.
* Improved probe/validation flow with clearer recovery options and more
robust sandbox build progress messages.
  * Secret input masks with reliable backspace behavior.

* **Bug Fixes**
* Credential sanitization (trim/line-ending normalization) and API key
validation now normalize and retry instead of exiting.
* Better classification and messaging for authorization/validation
failures; retries where appropriate.

* **Tests**
* Expanded tests for credential prompts, masking, retry flows,
validation classification, and onboarding navigation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
laitingsheng pushed a commit that referenced this pull request Apr 2, 2026
lakamsani pushed a commit to lakamsani/NemoClaw that referenced this pull request Apr 4, 2026
@wscurran added the `status: needs-info` label (for issues/PRs that lack a description; signals to the author that more detail is required) on Apr 14, 2026
When Ollama is not installed, the onboard wizard previously only offered
the NVIDIA Cloud API option on Linux. If build.nvidia.com was overloaded
or down, users couldn't obtain an API key and onboarding would fail with
no fallback.

Now, an "Install Ollama" option is always shown on both Linux and macOS
when Ollama is not already present. On Linux, the official curl installer
(ollama.com/install.sh) is used instead of Homebrew. This ensures users
always have a local inference path available regardless of NVIDIA API
server availability.

Also adds install-ollama as a valid NEMOCLAW_PROVIDER value for
non-interactive mode.

Fixes NVIDIA#301

Signed-off-by: Josue Gomez <josue@guatulab.com>
@futhgar force-pushed the fix/skip-api-key-local-inference branch from 8e84410 to 244d62e on April 14, 2026 01:48
@futhgar
Author

futhgar commented Apr 14, 2026

Rebased against main and ported to TypeScript per review feedback. All three original commits were squashed into one since the follow-up fixes (non-interactive fallback, ignoreError removal, test index correction) were incorporated directly into the TypeScript port.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/onboard.ts (1)

3306-3366: ⚠️ Potential issue | 🟠 Major

Reuse the existing Ollama startup path here.

This copy of the ollama branch already diverged: it dropped the isWsl() bind workaround, so WSL users who take the new Linux install path will start Ollama on 0.0.0.0 and hit the host-gateway reachability problem the original branch avoids. Extract the shared “start/pick/pull/validate Ollama” flow and keep only the install step platform-specific.

As per coding guidelines, "Limit cyclomatic complexity to 20 in JavaScript/TypeScript files, with target of 15".
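The extraction this comment asks for might take roughly the shape below. Everything here is an assumption for illustration: the function name, and in particular the direction of the WSL bind workaround, are guesses; the point is only that the install branch should call a shared helper instead of re-deriving the serve command.

```javascript
// Hypothetical sketch: one shared place for the Ollama serve command,
// parameterized on the WSL check instead of duplicated per branch.
function ollamaServeCommand(isWsl, port = 11434) {
  // Assumed direction of the workaround: loopback bind under WSL to avoid
  // the host-gateway reachability problem the review mentions.
  const host = isWsl ? "127.0.0.1" : "0.0.0.0";
  return `OLLAMA_HOST=${host}:${port} ollama serve > /dev/null 2>&1 &`;
}
```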

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 3306 - 3366, The install-ollama branch
duplicated the entire Ollama start/pick/pull/validate flow and dropped the
isWsl() bind workaround, increasing complexity and breaking WSL; extract the
shared startup/selection/validation logic into a reusable function (e.g.,
startAndValidateOllama or ensureLocalOllamaReady) and keep the platform-specific
part only for installation steps inside the selected.key === "install-ollama"
branch; ensure the new function uses the existing isWsl() check when building
the run command for starting Ollama (use host vs loopback bind behavior as in
the original flow), reuses symbols like OLLAMA_PORT,
getLocalProviderBaseUrl/getLocalProviderValidationBaseUrl,
getOllamaModelOptions, prepareOllamaModel, validateOpenAiLikeSelection,
promptOllamaModel, requestedModel, BACK_TO_SELECTION, and sets
preferredInferenceApi = "openai-completions" before returning — this both
restores the WSL bind workaround and reduces cyclomatic complexity by moving the
while loop and validation into the shared function.
🧹 Nitpick comments (1)
src/lib/onboard.ts (1)

2880-2890: Add coverage for the silent install-ollama fallback.

This branch only runs in non-interactive mode, and the existing Linux onboarding test in test/onboard-selection.test.ts only covers the “Ollama not installed” case. A small regression test for NEMOCLAW_PROVIDER=install-ollama when Ollama is already present would lock in the new CI behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 2880 - 2890, Add a test that exercises the
non-interactive "install-ollama" fallback path when Ollama is already installed:
in test/onboard-selection.test.ts add a case that sets
NEMOCLAW_PROVIDER=install-ollama, mocks/simulates Ollama as present (so the
options list includes a provider with key "ollama"), runs the onboarding
selection flow that uses providerKey and the options.find((o) => o.key ===
"ollama") branch, and asserts the code chooses the existing "ollama" provider
(no process.exit and no error logged). Ensure the test invokes the same
selection entrypoint used by the onboarding code so it hits the branch where
providerKey === "install-ollama" and verifies the silent fallback behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1777-1782: The resume conflict arises because setupNim() maps both
"ollama" and "install-ollama" to provider: "ollama-local" but
getEffectiveProviderName() doesn't normalize the "install-ollama" input, causing
getResumeConfigConflicts() to see a mismatch; update getEffectiveProviderName()
to treat the normalized input "install-ollama" the same as "ollama" (i.e.,
return "ollama-local" for either), or add "install-ollama" to the same
mapping/validProviders handling used for normalization so resumed runs with
NEMOCLAW_PROVIDER=install-ollama match the recorded "ollama-local".
- Around line 3311-3312: The piped installer call in onboard.ts using run("curl
-fsSL https://ollama.com/install.sh | sh") can hide curl failures because run()
spawns bash without pipefail; update the call or the run helper so the shell
enables pipefail. Either prepend "set -o pipefail; " (or "set -euo pipefail; ")
to the cmd string passed to run() where Ollama is installed, or modify the run()
implementation in src/lib/runner.ts (the run function that calls
spawnSync("bash", ["-c", cmd])) to always invoke bash with pipefail enabled
(e.g., ensure every "-c" execution sources "set -o pipefail" before the command)
so curl failures cause a non-zero exit.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0924340b-f264-4e5d-bef4-d058406d6406

📥 Commits

Reviewing files that changed from the base of the PR and between 8e84410 and 244d62e.

📒 Files selected for processing (1)
  • src/lib/onboard.ts

Comment thread src/lib/onboard.ts
Comment on lines +1777 to +1782
"install-ollama",
]);
if (!validProviders.has(normalized)) {
console.error(` Unsupported NEMOCLAW_PROVIDER: ${providerKey}`);
console.error(
" Valid values: build, openai, anthropic, anthropicCompatible, gemini, ollama, custom, nim-local, vllm",
" Valid values: build, openai, anthropic, anthropicCompatible, gemini, ollama, custom, nim-local, vllm, install-ollama",
Contributor


⚠️ Potential issue | 🟠 Major

Map install-ollama to ollama-local for resume checks.

setupNim() records both ollama and install-ollama as provider: "ollama-local", but getEffectiveProviderName() does not normalize the new key. That makes getResumeConfigConflicts() reject a resumed non-interactive run with NEMOCLAW_PROVIDER=install-ollama against a session that already recorded ollama-local.
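The normalization this comment describes could be sketched as follows. This is a hypothetical stand-in, not the real `getEffectiveProviderName()`; the real mapping in `onboard.ts` handles more keys than shown.

```javascript
// Hypothetical sketch of the normalization fix: install-ollama ends up
// recorded as ollama-local, same as ollama, so resume-conflict checks
// must treat the two input keys as equivalent.
function getEffectiveProviderName(providerKey) {
  const normalized = String(providerKey || "").trim().toLowerCase();
  if (normalized === "ollama" || normalized === "install-ollama") return "ollama-local";
  return normalized;
}
```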

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 1777 - 1782, The resume conflict arises
because setupNim() maps both "ollama" and "install-ollama" to provider:
"ollama-local" but getEffectiveProviderName() doesn't normalize the
"install-ollama" input, causing getResumeConfigConflicts() to see a mismatch;
update getEffectiveProviderName() to treat the normalized input "install-ollama"
the same as "ollama" (i.e., return "ollama-local" for either), or add
"install-ollama" to the same mapping/validProviders handling used for
normalization so resumed runs with NEMOCLAW_PROVIDER=install-ollama match the
recorded "ollama-local".
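The normalization the prompt describes might look like the following sketch; a minimal illustration only, not the repo's full `getEffectiveProviderName()` (which handles every provider key):

```typescript
// Sketch: normalize both "ollama" and "install-ollama" to the
// "ollama-local" name that setupNim() records, so resume-conflict
// checks compare like with like.
function getEffectiveProviderName(providerKey: string): string {
  const normalized = providerKey.trim().toLowerCase();
  if (normalized === "ollama" || normalized === "install-ollama") {
    return "ollama-local";
  }
  return normalized;
}

console.log(getEffectiveProviderName("install-ollama")); // → "ollama-local"
```

With this mapping, a resumed non-interactive run with `NEMOCLAW_PROVIDER=install-ollama` matches a session that already recorded `ollama-local` instead of being rejected.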

Comment thread src/lib/onboard.ts Outdated
Comment on lines +3311 to +3312
```ts
console.log(" Installing Ollama via official installer...");
run("curl -fsSL https://ollama.com/install.sh | sh");
```

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

RUNNER_FILE="$(fd -i '^runner\.(ts|js)$' src/lib | head -n 1)"
if [[ -z "${RUNNER_FILE}" ]]; then
  echo "Could not locate src/lib/runner.{ts,js}" >&2
  exit 1
fi

echo "Inspecting ${RUNNER_FILE} for run()/runCapture() shell invocation details..."
rg -n -C3 'pipefail|function run\b|const run\b|function runCapture\b|const runCapture\b|spawn(Sync)?\b|exec(File|Sync)?\b|bash|sh -lc|shell:' "$RUNNER_FILE"

echo
echo "Demonstrating default bash pipeline status vs pipefail:"
bash -lc 'false | cat >/dev/null; echo "default pipeline exit: $?"'
bash -o pipefail -lc 'false | cat >/dev/null; echo "pipefail pipeline exit: $?"'
```

Repository: NVIDIA/NemoClaw

Length of output: 2092


Fix the installer command to detect curl failures.

The run() helper in src/lib/runner.ts uses spawnSync("bash", ["-c", cmd]) without set -o pipefail. This means the piped installer command can exit successfully even if curl fails with a download/TLS/HTTP error, because the pipeline will report sh's status instead. The missing curl output leads sh to execute an empty script, exiting 0.

Recommended fixes:

  • Add set -o pipefail at the start of the cmd string passed to run()
  • Or modify the run() helper to enforce pipefail for all bash invocations
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 3311 - 3312, The piped installer call in
onboard.ts using run("curl -fsSL https://ollama.com/install.sh | sh") can hide
curl failures because run() spawns bash without pipefail; update the call or the
run helper so the shell enables pipefail. Either prepend "set -o pipefail; " (or
"set -euo pipefail; ") to the cmd string passed to run() where Ollama is
installed, or modify the run() implementation in src/lib/runner.ts (the run
function that calls spawnSync("bash", ["-c", cmd])) to always invoke bash with
pipefail enabled (e.g., ensure every "-c" execution sources "set -o pipefail"
before the command) so curl failures cause a non-zero exit.

…il to curl installer

Addresses CodeRabbit review feedback:
- getEffectiveProviderName() now maps 'install-ollama' to 'ollama-local'
  so resume checks don't reject NEMOCLAW_PROVIDER=install-ollama against
  sessions that recorded ollama-local
- curl | sh installer now uses 'set -o pipefail' so curl failures are
  detected instead of silently passing empty input to sh
@wscurran wscurran added status: rebase PR needs to be rebased against main before review can continue and removed status: needs-info For issues/PRs that lack a description. (Signals to the author that more detail is required). labels Apr 14, 2026
gemini2026 pushed a commit to gemini2026/NemoClaw that referenced this pull request Apr 14, 2026
…#1136)

## Summary

Smooth out inference configuration during `install.sh` / `nemoclaw
onboard`, especially when provider authorization, credential formatting,
endpoint probing, or final inference application fails.

This PR makes the hosted-provider onboarding path recoverable instead of
brittle:
- normalize and safely handle credential input
- classify validation failures more accurately
- let users re-enter credentials in place
- make final `openshell inference set` failures recoverable
- normalize over-specified custom base URLs
- add lower-level `back` / `exit` navigation so users can move up a
level without restarting the whole install
- clarify recovery prompts with explicit commands (`retry`, `back`,
`exit`)

## What Changed

- refactored provider probe execution to use direct `curl` argv
invocation instead of `bash -c`
- normalized credential values before use/persistence
- added structured auth / transport / model / endpoint failure
classification
- added in-place credential re-entry for hosted providers:
  - NVIDIA Endpoints
  - OpenAI
  - Anthropic
  - Google Gemini
  - custom OpenAI-compatible endpoints
  - custom Anthropic-compatible endpoints
- wrapped final provider/apply failures in interactive recovery instead
of hard abort
- added command-style recovery prompts:
  - `retry`
  - `back`
  - `exit`
- allowed `back` from lower-level inference prompts (model entry, base
URL entry, recovery prompts)
- normalized custom endpoint inputs to the minimum usable base URL
- removed stale `NVIDIA Endpoints (recommended)` wording
- secret prompts now show masked `*` feedback while typing/pasting
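The credential sanitization listed above can be sketched as follows; a minimal illustration, where `sanitizeCredential` is a hypothetical name rather than the function in this branch:

```typescript
// Sketch: trim whitespace and strip CR/LF so pasted keys with trailing
// newlines or Windows line endings don't corrupt auth headers or
// persisted credential files.
function sanitizeCredential(raw: string): string {
  return raw.replace(/\r?\n/g, "").trim();
}

console.log(sanitizeCredential("  nvapi-abc123\r\n")); // → "nvapi-abc123"
```

Normalizing at the input boundary means every later consumer (validation probes, persistence, header construction) sees the same clean value.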

## Validation

```bash
npx vitest run test/credentials.test.js test/onboard-selection.test.js test/onboard.test.js
npx vitest run test/cli.test.js
npx eslint bin/lib/credentials.js bin/lib/onboard.js test/credentials.test.js test/onboard-selection.test.js test/onboard.test.js
npx tsc -p jsconfig.json --noEmit
```

## Issue Mapping

Fully addressed in this PR:
- Fixes NVIDIA#1099
- Fixes NVIDIA#1101
- Fixes NVIDIA#1130

Substantially addressed / partially addressed:
- NVIDIA#987
- improves NVIDIA validation behavior and failure classification so
false/misleading connectivity failures are much less likely, but this PR
is framed as onboarding recovery hardening rather than a WSL-specific
networking fix
- NVIDIA#301
- improves graceful handling when validation/apply fails, especially for
transport/upstream problems, but does not add provider auto-fallback or
a broader cloud-outage fallback strategy
- NVIDIA#446
- improves recovery specifically for the inference-configuration step,
but does not fully solve general resumability across all onboarding
steps

Related implementation direction:
- NVIDIA#890
- this PR aligns with the intent of safer/non-shell probe execution and
clearer validation reporting
- NVIDIA#380
- not implemented here; no automatic provider fallback was added in this
branch

## Notes

- This PR intentionally does not weaken validation or reopen old
onboarding shortcuts.
- Unrelated local `tmp/` noise was left out of the branch.

Signed-off-by: Kevin Jones <kejones@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Interactive onboarding navigation (`back`/`exit`/`quit`) with
credential re-prompting and retry flows.
* Improved probe/validation flow with clearer recovery options and more
robust sandbox build progress messages.
  * Secret input masks with reliable backspace behavior.

* **Bug Fixes**
* Credential sanitization (trim/line-ending normalization) and API key
validation now normalize and retry instead of exiting.
* Better classification and messaging for authorization/validation
failures; retries where appropriate.

* **Tests**
* Expanded tests for credential prompts, masking, retry flows,
validation classification, and onboarding navigation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@wscurran wscurran removed the status: rebase PR needs to be rebased against main before review can continue label Apr 15, 2026

Labels

- enhancement: feature - Use this label to identify requests for new capabilities in NemoClaw.
- Local Models - Running NemoClaw with local models
- priority: medium - Issue that should be addressed in upcoming releases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Install fails because API server fails and there is no graceful fallback

2 participants