Merged
15 changes: 12 additions & 3 deletions docs/CLLAMA_SPEC.md
@@ -16,10 +16,19 @@ This document defines the contract between Clawdapus (the orchestrator) and a `c

## 2. API Contract

A `cllama` sidecar MUST expose a canonical ingress surface matrix for runner traffic.

Minimum required surfaces:

| Surface | Path | Payload family | Default use |
|---|---|---|---|
| OpenAI Chat Completions | `POST /v1/chat/completions` | OpenAI-compatible chat/completions | All non-Anthropic providers unless an explicit exception is documented |
| Anthropic Messages | `POST /v1/messages` | Anthropic Messages | Anthropic-family providers and explicit Anthropic-wire exceptions |

- **Listen Port:** The proxy MUST listen on `0.0.0.0:8080`.
- **Base URL Replacement:** Clawdapus configures the agent's runner (e.g., OpenClaw, Claude Code) to use `http://cllama-<type>:8080/v1` as its LLM base URL (first proxy in chain when chaining is enabled). The runner then targets one of the canonical ingress paths beneath that base URL.
- **Provider Identity vs Transport:** Operator-facing model refs keep provider identity (`google/gemini-*`, `anthropic/*`, etc.). The proxy ingress surface is a transport contract selected by infrastructure; runners MUST NOT invent synthetic provider prefixes such as `cllama/google`, and the shared ingress contract rejects them when compiling cllama-facing config.
- **Vendor-Native Extensions:** Additional vendor-native ingress surfaces MAY exist, but only as explicit, documented exceptions when a concrete runner cannot target the canonical surfaces. They are not the default contract.
- **Implementation Scope (Phase 4):** The wire protocol supports chained proxies, but runtime enforcement currently allows only one proxy type per pod. Declaring multiple proxy types fails fast until Phase 5 chain execution is implemented.
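As a sketch of how the surface matrix above composes with the replaced base URL, the snippet below restates the path column and joins it to a hypothetical sidecar hostname (`cllama-policy` is illustrative; the authoritative mapping lives in `internal/cllama/ingress.go`):

```go
package main

import "fmt"

// requestPath restates the surface-to-path column of the ingress
// matrix. This is a sketch for illustration only; drivers should use
// the shared contract code rather than a copy like this.
func requestPath(surface string) string {
	if surface == "anthropic-messages" {
		return "/v1/messages"
	}
	// Default per the contract: everything else speaks OpenAI Chat Completions.
	return "/v1/chat/completions"
}

func main() {
	// The runner's configured base URL follows http://cllama-<type>:8080/v1;
	// "cllama-policy" below is a made-up example of that pattern.
	host := "http://cllama-policy:8080"
	for _, surface := range []string{"openai-chat-completions", "anthropic-messages"} {
		fmt.Println(host + requestPath(surface))
	}
}
```
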

## 3. Context Injection (The Environment & Shared Mounts)
@@ -161,7 +170,7 @@ Phase 1 is retention only. cllama writes the history; no read API exists. Agents
Clawdapus provides a reference image: `ghcr.io/mostlydev/cllama`.

The passthrough reference:
- Adheres to the v1 ingress surface matrix and Listen Port.
- Validates the environment (`CLAW_POD`, `CLAW_CONTEXT_ROOT`, provider credentials), bearer-token identity resolution, and mounts.
- Acts as a pure, transparent proxy (no decoration, no amendment).
- Emits structured logs of all traffic.
6 changes: 4 additions & 2 deletions docs/decisions/008-cllama-sidecar-standard.md
@@ -12,7 +12,7 @@ Initially, `cllama` was conceived as a specific proxy component injected by Claw
We formalize `cllama` as a **mini-standard** rather than a single hardcoded implementation.

1. **The cllama Contract:** A `cllama` sidecar is any container image that:
- Exposes the cllama ingress surface matrix. The minimum required surfaces are OpenAI Chat Completions (`POST /v1/chat/completions`) and Anthropic Messages (`POST /v1/messages`).
- Accepts Clawdapus orchestration context (e.g., agent identity, loaded policy modules, capability labels, and the behavioral contract) injected via environment variables or volume mounts by the Clawdapus pod emitter.
- Emits standardized logs or labels back to Clawdapus for audit and drift scoring.
2. **Identity and Authorization Awareness:** The Clawdapus driver injects pod-level context and shared per-agent mounts. The sidecar resolves caller identity from bearer tokens (`<agent-id>:<secret>`) and the mounted context (`/claw/context/<agent-id>/`), enabling dynamic rights enforcement per agent.
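The bearer-token scheme in point 2 can be sketched as follows; the function name, error handling, and example agent ID are illustrative, not the sidecar's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// resolveIdentity parses a bearer token of the form "<agent-id>:<secret>"
// as described by the cllama contract. The agent ID keys into the mounted
// per-agent context at /claw/context/<agent-id>/. This is a sketch; a real
// sidecar would also verify the secret against injected credentials.
func resolveIdentity(bearer string) (agentID, secret string, ok bool) {
	agentID, secret, ok = strings.Cut(bearer, ":")
	if !ok || agentID == "" || secret == "" {
		return "", "", false
	}
	return agentID, secret, true
}

func main() {
	// "scout-1" is a hypothetical agent ID for demonstration.
	id, _, ok := resolveIdentity("scout-1:s3cr3t")
	fmt.Println(id, ok)
	fmt.Println("/claw/context/" + id + "/")
}
```
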
@@ -21,10 +21,12 @@ We formalize `cllama` as a **mini-standard** rather than a single hardcoded impl

## Rationale

Formalizing `cllama` as a standard makes the policy pipeline pluggable. Operators can build custom sidecars with proprietary DLP (Data Loss Prevention) rules, specific compliance checks, or advanced drift scoring, simply by conforming to the documented cllama ingress surfaces and consuming the injected Clawdapus context.

Passing identity and rights into the sidecar elevates it from a dumb proxy to a context-aware governance enforcement point, capable of blocking a specific agent from taking an action based on its unique constraints.

ADR-023 later makes the ingress surface matrix explicit so provider identity and runner transport cannot drift apart in driver code.

## Consequences

**Positive:**
66 changes: 66 additions & 0 deletions docs/decisions/023-cllama-ingress-surface-matrix.md
@@ -0,0 +1,66 @@
# ADR-023: Explicit cllama Ingress Surface Matrix

**Date:** 2026-04-09
**Status:** Accepted
**Depends on:** ADR-008 (cllama Sidecar Standard)
**Related to:** ADR-019 (Model Policy Authority and Declared Failover)
**Implementation:** Issue #134

## Context

ADR-008 established `cllama` as a standardized sidecar interface, but the contract text described it primarily as an OpenAI-compatible proxy. The reference implementation and runtime had already moved beyond that: `cllama` accepts both OpenAI Chat Completions (`/v1/chat/completions`) and Anthropic Messages (`/v1/messages`).

That mismatch became operationally important in issue #127. The outage was not fundamentally about Gemini. The real problem was that the system lacked a single authoritative answer to a more basic question:

**What protocol surface should a runner speak when `cllama` is in the path?**

Without an explicit contract:

- provider identity (`google`, `anthropic`, `openrouter`) became entangled with runner transport selection
- OpenClaw carried a private provider-to-protocol switch for `cllama`
- future provider additions could silently reintroduce vendor-native APIs behind `cllama`
- docs described `cllama` as OpenAI-only while the runtime already supported more than that

## Decision

1. `cllama` owns a canonical ingress surface matrix for runner traffic.
2. The minimum required ingress surfaces are:
- OpenAI Chat Completions: `POST /v1/chat/completions`
- Anthropic Messages: `POST /v1/messages`
3. Provider identity remains in operator-facing model refs.
- Example: `google/gemini-3-flash-preview`, `anthropic/claude-sonnet-4`
- We do not invent synthetic provider prefixes such as `cllama/google`.
- The shared ingress contract rejects reserved synthetic ingress prefixes when compiling cllama-facing config.
4. When `cllama` is enabled, drivers compile declared model refs to one of the canonical ingress surfaces through shared infrastructure code.
5. Anthropic-family providers, and other explicit Anthropic-wire exceptions, route through the Anthropic Messages surface.
6. All other providers route through the OpenAI Chat Completions surface by default.
7. Vendor-native ingress surfaces are allowed only as explicit, documented exceptions when a concrete runner cannot target the canonical surfaces.
8. Runner-specific configuration must map from the canonical ingress surface, not directly from provider names.
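Decisions 3, 5, and 6 can be sketched in a few lines; the provider set and function names below are illustrative, and the authoritative mapping is `IngressSurfaceForProvider` in `internal/cllama/ingress.go`:

```go
package main

import (
	"fmt"
	"strings"
)

// surfaceFor sketches decisions 5 and 6: Anthropic-family providers
// route through the Anthropic Messages surface; all other providers
// default to OpenAI Chat Completions. Only "anthropic" is shown here;
// the real contract covers the full Anthropic-wire exception list.
func surfaceFor(provider string) string {
	if provider == "anthropic" {
		return "POST /v1/messages"
	}
	return "POST /v1/chat/completions"
}

// acceptRef sketches decision 3: provider identity stays in the model
// ref, and the reserved synthetic prefix "cllama/" is rejected when
// compiling cllama-facing config.
func acceptRef(ref string) bool {
	return !strings.HasPrefix(ref, "cllama/")
}

func main() {
	fmt.Println(surfaceFor("google"))
	fmt.Println(surfaceFor("anthropic"))
	fmt.Println(acceptRef("google/gemini-3-flash-preview"))
	fmt.Println(acceptRef("cllama/google/gemini-3-flash-preview"))
}
```
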

## Rationale

This keeps the separation of concerns clean:

- provider identity remains stable in the operator contract
- transport selection becomes infrastructure-owned instead of runner-owned
- adding a new provider does not require every driver to rediscover which wire protocol should be used behind `cllama`

The result is a smaller and more legible trust boundary. Runners stay untrusted. `cllama` remains the policy and routing layer. Drivers become compilers from model refs to canonical proxy surfaces.

## Consequences

**Positive:**
- A single shared contract now decides which ingress surface a runner should target behind `cllama`.
- OpenClaw no longer owns the canonical provider-to-surface decision in a private helper.
- Future provider additions are less likely to regress into vendor-native routing bugs.
- The public spec and ADR set can describe the actual runtime instead of a partially outdated approximation.

**Negative:**
- The contract is intentionally narrower than "support every upstream API natively", which means truly incompatible runners still require explicit adapter work.
- New canonical surfaces now require an architecture decision, not just a local driver patch.

## Notes

- This ADR extends ADR-008; it does not replace the broader sidecar-standard decision.
- This ADR does not change ADR-019 model-policy authority. It only formalizes the runner-to-proxy transport contract.
- Only OpenClaw needed an immediate integration change for this ADR. The other in-tree drivers do not currently compile provider identity into runner-specific API-surface enums; they only rewrite base URLs, API keys, or generic custom-provider fields when cllama is enabled.
125 changes: 125 additions & 0 deletions internal/cllama/ingress.go
@@ -0,0 +1,125 @@
package cllama

import (
	"fmt"
	"sort"
	"strings"
)

type IngressSurface string

const (
	IngressSurfaceOpenAIChatCompletions IngressSurface = "openai-chat-completions"
	IngressSurfaceAnthropicMessages     IngressSurface = "anthropic-messages"
)

// NormalizeProviderID canonicalizes provider aliases that Clawdapus accepts in
// operator-facing model refs before they are compiled to cllama-facing config.
func NormalizeProviderID(provider string) string {
	normalized := strings.ToLower(strings.TrimSpace(provider))
	switch normalized {
	case "z.ai", "z-ai":
		return "zai"
	case "opencode-zen":
		return "opencode"
	case "qwen":
		return "qwen-portal"
	case "kimi-code":
		return "kimi-coding"
	case "bytedance", "doubao":
		return "volcengine"
	default:
		return normalized
	}
}

// SplitProviderModelRef splits a provider/model ref and normalizes the provider
// to the canonical ID Clawdapus uses for cllama wiring. Bare model IDs default
// to the anthropic provider for compatibility with existing model-ref handling.
func SplitProviderModelRef(ref string) (string, string, bool) {
	trimmed := strings.TrimSpace(ref)
	if trimmed == "" {
		return "", "", false
	}

	parts := strings.SplitN(trimmed, "/", 2)
	provider := "anthropic"
	modelID := trimmed
	if len(parts) == 2 {
		provider = parts[0]
		modelID = parts[1]
	}

	provider = NormalizeProviderID(provider)
	modelID = strings.TrimSpace(modelID)
	if provider == "cllama" {
		return "", "", false
	}
	if provider == "" || modelID == "" {
		return "", "", false
	}
	return provider, modelID, true
}

// ProviderQualifiedModelRef returns the normalized provider plus a canonical
// provider-prefixed model ref for use in cllama-facing runner config.
func ProviderQualifiedModelRef(ref string) (string, string, bool) {
	provider, modelID, ok := SplitProviderModelRef(ref)
	if !ok {
		return "", "", false
	}
	return provider, provider + "/" + modelID, true
}

// CollectProviderModels groups declared model refs by normalized provider and
// emits deterministic provider-prefixed model IDs.
func CollectProviderModels(models map[string]string) (map[string][]string, error) {
	byProvider := make(map[string]map[string]struct{})
	for slot, rawRef := range models {
		if strings.TrimSpace(rawRef) == "" {
			continue
		}
		provider, modelRef, ok := ProviderQualifiedModelRef(rawRef)
		if !ok {
			return nil, fmt.Errorf("invalid cllama provider/model ref for slot %q: %q", slot, rawRef)
		}
		if _, exists := byProvider[provider]; !exists {
			byProvider[provider] = make(map[string]struct{})
		}
		byProvider[provider][modelRef] = struct{}{}
	}

	out := make(map[string][]string, len(byProvider))
	for provider, ids := range byProvider {
		modelIDs := make([]string, 0, len(ids))
		for id := range ids {
			modelIDs = append(modelIDs, id)
		}
		sort.Strings(modelIDs)
		out[provider] = modelIDs
	}
	return out, nil
}

// IngressSurfaceForProvider returns the canonical cllama ingress surface a
// runner should target for the given provider when cllama is enabled.
func IngressSurfaceForProvider(provider string) IngressSurface {
	switch NormalizeProviderID(provider) {
	case "anthropic", "synthetic", "minimax-portal", "kimi-coding", "cloudflare-ai-gateway", "xiaomi":
		return IngressSurfaceAnthropicMessages
	default:
		return IngressSurfaceOpenAIChatCompletions
	}
}

// RequestPath returns the canonical HTTP path for the ingress surface. Runner
// config templates and proxy docs should derive paths from this contract
// instead of duplicating string literals.
func (surface IngressSurface) RequestPath() string {
	switch surface {
	case IngressSurfaceAnthropicMessages:
		return "/v1/messages"
	default:
		return "/v1/chat/completions"
	}
}
111 changes: 111 additions & 0 deletions internal/cllama/ingress_test.go
@@ -0,0 +1,111 @@
package cllama

import (
	"reflect"
	"testing"
)

func TestNormalizeProviderID(t *testing.T) {
	tests := []struct {
		in   string
		want string
	}{
		{in: "google", want: "google"},
		{in: "z.ai", want: "zai"},
		{in: "Z-AI", want: "zai"},
		{in: "qwen", want: "qwen-portal"},
		{in: "kimi-code", want: "kimi-coding"},
		{in: "doubao", want: "volcengine"},
	}

	for _, tc := range tests {
		if got := NormalizeProviderID(tc.in); got != tc.want {
			t.Fatalf("NormalizeProviderID(%q) = %q, want %q", tc.in, got, tc.want)
		}
	}
}

func TestSplitProviderModelRefDefaultsBareModelsToAnthropic(t *testing.T) {
	provider, modelID, ok := SplitProviderModelRef("claude-sonnet-4")
	if !ok {
		t.Fatal("expected bare model ref to parse")
	}
	if provider != "anthropic" || modelID != "claude-sonnet-4" {
		t.Fatalf("got provider=%q model=%q", provider, modelID)
	}
}

func TestProviderQualifiedModelRefNormalizesProviderAliases(t *testing.T) {
	provider, modelRef, ok := ProviderQualifiedModelRef("qwen/qwen3-235b-a22b")
	if !ok {
		t.Fatal("expected aliased provider ref to parse")
	}
	if provider != "qwen-portal" {
		t.Fatalf("provider = %q, want qwen-portal", provider)
	}
	if modelRef != "qwen-portal/qwen3-235b-a22b" {
		t.Fatalf("modelRef = %q, want qwen-portal/qwen3-235b-a22b", modelRef)
	}
}

func TestCollectProviderModelsDeduplicatesAndSorts(t *testing.T) {
	models := map[string]string{
		"primary":   "google/gemini-2.5-flash",
		"fallback":  "anthropic/claude-sonnet-4",
		"secondary": "google/gemini-2.5-pro",
		"cheap":     "google/gemini-2.5-flash",
	}

	got, err := CollectProviderModels(models)
	if err != nil {
		t.Fatalf("CollectProviderModels() unexpected error: %v", err)
	}
	want := map[string][]string{
		"anthropic": {"anthropic/claude-sonnet-4"},
		"google":    {"google/gemini-2.5-flash", "google/gemini-2.5-pro"},
	}
	if !reflect.DeepEqual(got, want) {
		t.Fatalf("CollectProviderModels() = %#v, want %#v", got, want)
	}
}

func TestSplitProviderModelRefRejectsSyntheticIngressProviderPrefix(t *testing.T) {
	if provider, modelID, ok := SplitProviderModelRef("cllama/google/gemini-3-flash-preview"); ok {
		t.Fatalf("expected synthetic ingress provider prefix to be rejected, got provider=%q model=%q", provider, modelID)
	}
}

func TestCollectProviderModelsRejectsSyntheticIngressProviderPrefix(t *testing.T) {
	_, err := CollectProviderModels(map[string]string{
		"primary": "cllama/google/gemini-3-flash-preview",
	})
	if err == nil {
		t.Fatal("expected synthetic ingress provider prefix to fail")
	}
	if err.Error() != `invalid cllama provider/model ref for slot "primary": "cllama/google/gemini-3-flash-preview"` {
		t.Fatalf("unexpected error: %v", err)
	}
}

func TestIngressSurfaceForProvider(t *testing.T) {
	tests := []struct {
		provider string
		want     IngressSurface
		path     string
	}{
		{provider: "google", want: IngressSurfaceOpenAIChatCompletions, path: "/v1/chat/completions"},
		{provider: "openrouter", want: IngressSurfaceOpenAIChatCompletions, path: "/v1/chat/completions"},
		{provider: "anthropic", want: IngressSurfaceAnthropicMessages, path: "/v1/messages"},
		{provider: "kimi-code", want: IngressSurfaceAnthropicMessages, path: "/v1/messages"},
	}

	for _, tc := range tests {
		got := IngressSurfaceForProvider(tc.provider)
		if got != tc.want {
			t.Fatalf("IngressSurfaceForProvider(%q) = %q, want %q", tc.provider, got, tc.want)
		}
		if got.RequestPath() != tc.path {
			t.Fatalf("IngressSurfaceForProvider(%q).RequestPath() = %q, want %q", tc.provider, got.RequestPath(), tc.path)
		}
	}
}