Agent Diagnostic
- Loaded debug-inference and openshell-cli skills
- Traced the inference routing path through the codebase: proxy.rs:351 intercepts inference.local → route_inference_request() → Router::proxy_with_candidates_streaming() → reqwest::Client → upstream at host.openshell.internal:11434
- Confirmed the SSRF check at proxy.rs:480-601 does NOT apply to managed inference routes (the inference.local interception returns early at line 374)
- Traced host gateway IP detection in cluster-entrypoint.sh:397-415: it resolves host.docker.internal via getent ahostsv4 and falls back to the container default route
- On Docker Desktop + WSL2, host.docker.internal resolves to either an IPv6 ULA (fdc4:...) or 172.29.0.254 (Docker Desktop gateway); both are unreachable from inside k3s pods
- On Docker Engine + WSL2, detection falls back to 172.17.0.1 (docker0 bridge), which has asymmetric routing and times out
- The bad IP propagates: cluster-entrypoint.sh → __HOST_GATEWAY_IP__ in HelmChart → hostGatewayIP Helm value → StatefulSet hostAliases → gateway pod /etc/hosts → router reqwest::Client DNS resolution → connection failure
- The router's reqwest::Client (openshell-router/src/lib.rs:39-41) has no IP filtering, so once the resolved IP is reachable, inference works
Description
When running OpenShell on Docker Desktop + WSL2 (or Docker Engine on WSL2), host.openshell.internal resolves to an unreachable IP address. This breaks any feature that depends on reaching host services from inside the cluster, most notably local inference routing (e.g., Ollama at host.openshell.internal:11434).
The root cause is in deploy/docker/cluster-entrypoint.sh lines 397-415. The detection logic:
- Tries getent ahostsv4 host.docker.internal. On Docker Desktop/WSL2 this returns an IPv6 ULA or an unreachable gateway IP
- Falls back to ip -4 route | awk '/default/ { print $3 }'. On Docker Engine/WSL2 this returns the docker0 bridge IP, which has asymmetric routing
Neither produces a usable IPv4 address that pods can reach.
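To see which of the two steps yields the bad address on a given machine, the detection can be approximated as below. This is a minimal sketch, not the code in cluster-entrypoint.sh: the function names and the exact awk filters are assumptions made for illustration.

```shell
# Sketch only: mirrors the two detection steps described above.
# Function names and awk filters are assumptions, not copied from
# cluster-entrypoint.sh.

# Step 1: first address in `getent ahostsv4` output (read from stdin).
first_getent_address() {
    awk 'NF { print $1; exit }'
}

# Step 2: gateway of the default route in `ip -4 route` output (stdin).
default_route_gateway() {
    awk '/^default/ { print $3; exit }'
}

# Run both steps against the live system and print what each one yields.
report_detection() {
    step1=$(getent ahostsv4 host.docker.internal 2>/dev/null | first_getent_address)
    step2=$(ip -4 route 2>/dev/null | default_route_gateway)
    echo "getent ahostsv4 host.docker.internal -> ${step1:-<empty>}"
    echo "default route gateway               -> ${step2:-<empty>}"
}
```

Calling report_detection inside the cluster container shows both candidates side by side, which makes it easier to tell whether a given setup hits the Docker Desktop case or the Docker Engine case.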
Expected: host.openshell.internal should resolve to an IP where the host's services (e.g., Ollama on port 11434) are actually reachable.
Actual: The resolved IP is either IPv6, unreachable, or has broken routing. The router's upstream connection fails with RouterError::UpstreamUnavailable.
Reproduction Steps
- Run on WSL2 with Docker Desktop (Windows 11)
- Start Ollama on the host:
OLLAMA_HOST=0.0.0.0:11434 ollama serve
- Start the gateway:
openshell gateway start
- Create a provider and set inference:
openshell provider create --name ollama --type openai \
--credential OPENAI_API_KEY=empty \
--config OPENAI_BASE_URL=http://host.openshell.internal:11434/v1
openshell inference set --no-verify --provider ollama --model qwen2.5-coder:3b
- Create a sandbox and test inference:
openshell sandbox create -- bash
# Inside sandbox:
curl -sS https://inference.local/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"qwen2.5-coder:3b","messages":[{"role":"user","content":"say hello"}]}'
- Result: a timeout or {"error":"upstream unavailable"}, because the router cannot connect to host.openshell.internal:11434
Workaround: Use the WSL2 eth0 IP in OPENAI_BASE_URL instead of host.openshell.internal:
openshell provider create --name ollama --type openai \
--credential OPENAI_API_KEY=empty \
--config OPENAI_BASE_URL=http://$(hostname -I | awk '{print $1}'):11434/v1
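hostname -I can print several addresses, so taking the first field may not always give the eth0 IPv4 address. The sketch below is one way to make the selection explicit; first_ipv4_of and ollama_base_url are hypothetical helper names, and the dotted-quad check is deliberately loose.

```shell
# Sketch only: pick the first IPv4-looking entry from a `hostname -I`
# style, space-separated list. Helper names are hypothetical.
first_ipv4_of() {
    for addr in $1; do
        case "$addr" in
            *:*) continue ;;                       # skip IPv6 entries
            *.*.*.*) echo "$addr"; return 0 ;;     # loose dotted-quad match
        esac
    done
    return 1
}

# Build the OpenAI-compatible base URL used in the workaround above.
ollama_base_url() {
    echo "http://$1:11434/v1"
}

# Usage (run in the WSL2 distro, not inside the sandbox):
#   ip=$(first_ipv4_of "$(hostname -I)") && ollama_base_url "$ip"
```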
Environment
- OS: Windows 11 + WSL2 (Ubuntu, kernel 6.6.87.2)
- Docker: Docker Desktop 4.x with WSL2 backend
- OpenShell: current main
- Also affects: Docker Engine running inside WSL2
Logs
# Inside the cluster container:
$ getent ahostsv4 host.docker.internal
# Returns empty or IPv6 on Docker Desktop/WSL2
# Gateway pod logs show:
INFO openshell_router: routing proxy inference request (streaming)
# Followed by upstream connection failure (no NET:OPEN, just timeout)
Proposed Fix
After the existing detection in cluster-entrypoint.sh:415, add:
- IPv6 rejection: if the detected IP contains a colon (:), discard it and try the fallbacks
- WSL2 eth0 fallback: try ip -4 addr show eth0 for the WSL2 distro IP
- Environment variable override: accept OPENSHELL_HOST_GATEWAY_IP_OVERRIDE
This is additive and does not affect platforms where the existing detection works (macOS Docker Desktop, native Linux).
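The three additions could be layered roughly as follows. This is a sketch under stated assumptions, not a patch: resolve_host_gateway_ip and eth0_ipv4 are hypothetical names, and giving the override the highest precedence is an assumption, since the proposal does not specify an order.

```shell
# Sketch only: one way to combine the three proposed additions.
# Function names and precedence order are assumptions.
# Args: $1 = IP from the existing detection, $2 = eth0 IPv4 (may be empty).
resolve_host_gateway_ip() {
    detected=$1
    eth0_ip=$2

    # Explicit override wins whenever it is set.
    if [ -n "${OPENSHELL_HOST_GATEWAY_IP_OVERRIDE:-}" ]; then
        echo "$OPENSHELL_HOST_GATEWAY_IP_OVERRIDE"
        return 0
    fi

    # IPv6 rejection: discard a detected value that contains ':'.
    case "$detected" in
        *:*) detected="" ;;
    esac

    # Keep a surviving detected IPv4, otherwise fall back to eth0.
    if [ -n "$detected" ]; then
        echo "$detected"
    elif [ -n "$eth0_ip" ]; then
        echo "$eth0_ip"
    else
        return 1
    fi
}

# Extract the IPv4 address from `ip -4 addr show eth0` output (stdin).
eth0_ipv4() {
    awk '/inet / { sub(/\/.*/, "", $2); print $2; exit }'
}

# Usage:
#   resolve_host_gateway_ip "$detected_ip" "$(ip -4 addr show eth0 | eth0_ipv4)"
```

In this sketch an IPv4 result that is unreachable (such as 172.29.0.254 or 172.17.0.1) still passes the IPv6 check, so only the override corrects those cases. A real fix might also need a reachability probe before accepting the detected value.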
Related: #681 (WSL2 proxy issues, different root cause), #642 (sandbox networking on WSL2)
Agent-First Checklist
debug-openshell-cluster, debug-inference, openshell-cli