NemoClaw: Local Ollama inference routing fails from sandbox
Environment: DGX Spark (GB10, aarch64), Ubuntu 24.04, OpenShell 0.0.10, NemoClaw 0.0.x, OpenClaw 2026.3.11, Ollama 0.18.0 (snap), Docker 28.x w/ nvidia runtime, cgroup v2 host mode.
Goal: Route sandbox inference to local Ollama (nemotron-3-super:120b) on host port 11434 instead of NVIDIA NIM cloud.
Confirmed working: NVIDIA NIM cloud inference via nvidia provider type works end-to-end. curl from sandbox to https://inference.local/v1/chat/completions returns valid completions when provider is nvidia-nim. Ollama OpenAI-compat endpoint works from host: curl http://localhost:11434/v1/chat/completions returns valid JSON.
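The host-side check can be reproduced without any SDK. A minimal stdlib sketch of the request shape (model name and the dummy "ollama" bearer token are from the report; the actual send is commented out since it needs a running Ollama):

```python
# Host-side sanity check of Ollama's OpenAI-compatible endpoint, using only
# the standard library. Ollama accepts any bearer token, so "ollama" works
# as a dummy key. Sending is commented out (requires Ollama on the host).
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"  # host-side Ollama, per the report

body = {
    "model": "nemotron-3-super:120b",
    "messages": [{"role": "user", "content": "ping"}],
    "stream": False,
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer ollama",
    },
)
print(req.full_url)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```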
Attempt 1: openai provider type with baseUrl config
openshell provider create --name ollama-local --type openai \
--credential OPENAI_API_KEY=ollama --config baseUrl=http://localhost:11434/v1
openshell inference set --provider ollama-local --model nemotron-3-super:120b
Result: Verification step hits https://api.openai.com/v1 instead of configured baseUrl. Returns 401 from OpenAI. The baseUrl config key is stored (openshell provider get shows Config keys: baseUrl) but ignored during both verification and runtime routing. Using --no-verify bypasses verification but sandbox curl to inference.local still returns OpenAI 401 — confirming gateway hardcodes api.openai.com for openai type.
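The symptom is consistent with the gateway resolving the upstream from a fixed per-type table and never consulting stored provider config. A toy model of that hypothesis (table entries are illustrative guesses, not OpenShell source):

```python
# Toy model of the routing behavior the 401s imply: provider config is
# persisted but never consulted when resolving the upstream, so baseUrl
# has no effect. Illustrative only -- not OpenShell's actual code.
CANONICAL_UPSTREAMS = {  # hypothetical per-type table
    "openai": "https://api.openai.com/v1",
    "anthropic": "https://api.anthropic.com/v1",
    "nvidia": "https://integrate.api.nvidia.com/v1",
}

def resolve_upstream(provider_type: str, config: dict) -> str:
    # Bug as observed: config.get("baseUrl") is stored but ignored here.
    return CANONICAL_UPSTREAMS[provider_type]

url = resolve_upstream("openai", {"baseUrl": "http://localhost:11434/v1"})
print(url)  # the hardcoded OpenAI endpoint wins
```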
Attempt 2: nvidia provider type with baseUrl config
openshell provider create --name ollama-local --type nvidia \
--credential NVIDIA_API_KEY=ollama --config baseUrl=http://localhost:11434/v1
openshell inference set --provider ollama-local --model nemotron-3-super:120b --no-verify
Result: Inference route accepted (version incremented). Sandbox curl to https://inference.local/v1/chat/completions returns 404 Unknown. Verbose curl shows TLS proxy connect to 10.200.0.1:3128 succeeds (HTTP 200 Connection Established) then backend returns 404. Gateway is reaching something but not Ollama. Tried base_url (underscore) and endpoint config keys — same 404.
Attempt 3: generic provider type
openshell provider create --name ollama-local --type generic \
--config baseUrl=http://localhost:11434/v1 --credential API_KEY=ollama
Result: Provider created successfully. But openshell inference set rejects it: provider 'ollama-local' has unsupported type 'generic' for cluster inference (supported: openai, anthropic, nvidia).
Attempt 4: Direct sandbox→host via network policy
Added ollama_local network policy entry to sandbox policy YAML:
ollama_local:
  name: ollama_local
  endpoints:
    - host: 172.17.0.1
      port: 11434
      protocol: rest
  enforcement: enforce
  rules:
    - allow: { method: "*", path: "/**" }
  binaries:
    - { path: /usr/local/bin/openclaw }
    - { path: /usr/bin/curl }
Applied via openshell policy set --policy ~/ollama-policy.yaml ultra-ops --wait — policy v2 accepted and loaded.
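Assuming the rule matcher is glob-style (an assumption; OpenShell's matcher semantics aren't documented here), a quick offline check confirms the "/**" path pattern would cover the chat completions path, so the later 403 shouldn't be a path-match miss:

```python
# Offline check that the policy's path pattern covers the endpoint,
# assuming glob-style semantics. fnmatch's "*" crosses "/" separators,
# so "/**" behaves like "match any path" -- an assumption about
# OpenShell's matcher, not a documented fact.
from fnmatch import fnmatch

pattern = "/**"
for path in ("/", "/v1/chat/completions", "/api/tags"):
    print(path, fnmatch(path, pattern))
```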
Result (through proxy): Sandbox env has http_proxy=http://10.200.0.1:3128. All HTTP goes through OpenShell gateway proxy. Proxy returns HTTP/1.1 403 Forbidden for http://172.17.0.1:11434. Policy allows the endpoint but proxy blocks it — likely because it's plain HTTP (no TLS) and/or the proxy has an independent allowlist.
Result (bypassing proxy): curl --noproxy "*" http://172.17.0.1:11434 returns Connection refused. Sandbox network namespace cannot reach Docker bridge IP directly — only through the proxy. Confirmed sandbox egress IP is 10.200.0.2, route via 10.200.0.1 (gateway proxy).
Interesting: curl -s http://172.17.0.1:11434 (through proxy, no path) returns empty 200 — the "Ollama is running" response. So the proxy can reach Ollama for simple GET. But POST to /v1/chat/completions gets 403.
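The GET-passes/POST-403 split can be reproduced locally with a toy method-filtering forward proxy, which supports the independent-filter hypothesis. This models the suspected gateway behavior only; it is not OpenShell's proxy code:

```python
# Local reproduction of the observed split: a forward proxy that relays
# GET but rejects POST with 403, in front of a stub "Ollama is running"
# origin. Models the suspected gateway behavior -- not OpenShell code.
import http.server
import threading
import urllib.error
import urllib.request

# Opener with no proxies, for the proxy's own upstream fetches.
DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))

class Origin(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"Ollama is running")
    def log_message(self, *args):  # keep output quiet
        pass

class MethodFilterProxy(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Forward proxies receive the absolute URL in the request line.
        with DIRECT.open(self.path) as upstream:
            body = upstream.read()
        self.send_response(200)
        self.end_headers()
        self.wfile.write(body)
    def do_POST(self):
        self.send_response(403)  # reject plain-HTTP POST outright
        self.end_headers()
    def log_message(self, *args):
        pass

origin = http.server.HTTPServer(("127.0.0.1", 0), Origin)
proxy = http.server.HTTPServer(("127.0.0.1", 0), MethodFilterProxy)
for srv in (origin, proxy):
    threading.Thread(target=srv.serve_forever, daemon=True).start()

# Client routed through the filtering proxy, like the sandbox env.
opener = urllib.request.build_opener(urllib.request.ProxyHandler(
    {"http": f"http://127.0.0.1:{proxy.server_port}"}))
url = f"http://127.0.0.1:{origin.server_port}/"

get_status = opener.open(url).status
try:
    opener.open(urllib.request.Request(url, data=b"{}", method="POST"))
    post_status = 200
except urllib.error.HTTPError as err:
    post_status = err.code

print("GET:", get_status, "POST:", post_status)  # GET: 200 POST: 403
```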
Root cause assessment
- Provider types (openai, nvidia) hardcode upstream base URLs. The baseUrl/base_url/endpoint config keys are stored but not used for inference routing. The gateway always routes to the canonical endpoint for the provider type.
- The generic provider type exists but is excluded from the inference routing system (supported: openai, anthropic, nvidia).
- The sandbox HTTP proxy (10.200.0.1:3128) blocks non-TLS POST requests to internal hosts even when network policy explicitly allows the endpoint. GET succeeds, POST returns 403.
- Net result: no path exists in OpenShell 0.0.10 to route inference.local to a local Ollama instance.
Feature request
Support a provider type (or extend generic) that:
- Accepts a user-defined baseUrl for inference routing
- Routes inference.local proxy traffic to that URL
- Works with plain HTTP endpoints on the host (Ollama, vLLM, llama.cpp server)
This is the primary use case for DGX Spark users running NemoClaw with local models.
Workaround
Use nvidia-nim provider with NVIDIA cloud inference. Works but burns finite API credits and adds 200-500ms latency vs local.