Support mixed managed and runner-native tool-call responses with ordered execution and agent feedback #165

@mostlydev

Description

Summary

When a model returns both managed and runner-native tool calls in the same response, cllama currently fails closed with:

mixed managed and runner-native tool calls are not supported in one model response

That is operationally too brittle now that additive managed tools are the norm: agents naturally emit both classes in one response, their turn aborts, and they receive no guidance on how to recover.

Current behavior

Today cllama/internal/proxy/toolmediation.go partitions the response into managed vs runner-native calls and immediately returns a 502 if both are present in the same model response.

This means:

  • the turn stops at the proxy boundary
  • the runner never receives an actionable instruction about how to retry
  • operators have to prompt around proxy internals
  • a normal plan like "call service tool, then local file/shell/search tool" fails if the model batches both into one response
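As a sketch of the current fail-closed step (using simplified stand-in types, since the actual toolmediation.go internals aren't shown here), the partition-and-refuse logic is roughly:

```go
package main

import (
	"errors"
	"fmt"
)

// ToolCall is a hypothetical stand-in for the proxy's parsed tool-call type.
type ToolCall struct {
	Name    string
	Managed bool // true if cllama mediates this tool itself
}

var errMixed = errors.New("mixed managed and runner-native tool calls are not supported in one model response")

// partitionStrict mirrors today's behavior: split calls by ownership and
// fail closed as soon as both classes appear in one response.
func partitionStrict(calls []ToolCall) (managed, native []ToolCall, err error) {
	for _, c := range calls {
		if c.Managed {
			managed = append(managed, c)
		} else {
			native = append(native, c)
		}
	}
	if len(managed) > 0 && len(native) > 0 {
		return nil, nil, errMixed // surfaced to the runner as a 502
	}
	return managed, native, nil
}

func main() {
	_, _, err := partitionStrict([]ToolCall{
		{Name: "service_lookup", Managed: true},
		{Name: "shell", Managed: false},
	})
	fmt.Println(err)
}
```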

Desired behavior

We need two improvements:

  1. Recovery feedback

    • If mixed ownership still cannot be executed safely, the agent/runner should receive an explicit response telling it to split the calls into separate responses in a safe order.
    • The failure should be visible in audit/telemetry instead of looking like a silent stop.
  2. Transparent execution when safe

    • Support mixed managed + runner-native tool-call responses without requiring agents to understand proxy ownership internals.
    • Prefer an ordering that preserves semantics and transcript continuity.

Proposed direction

Treat the mixed response as an ordered sequence rather than an invalid set when it can be reduced safely.

OpenAI / Anthropic

  • Parse tool calls in the exact order emitted by the model.
  • Execute the maximal leading run of managed tool calls inside cllama.
  • If the first runner-native call appears after one or more managed calls:
    • append the managed tool results into the hidden transcript as usual
    • return a runner-visible response containing only the remaining runner-native tool calls in original order
    • persist continuity so the upstream model sees the hidden managed rounds before the runner-native call on the follow-up request
  • If a runner-native call appears before a later managed call in the same response, do not guess by reordering. Fail closed, but return a structured message that instructs the agent to retry with managed calls first and runner-native calls in a later response.

This keeps the proxy from silently reordering the model's plan while still making the common "managed first, native second" pattern work.
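The splitting rule above can be sketched as a single pass over the ordered calls (again with hypothetical stand-in types rather than the real toolmediation.go ones):

```go
package main

import "fmt"

// ToolCall is a hypothetical stand-in for the proxy's parsed tool-call type.
type ToolCall struct {
	Name    string
	Managed bool
}

// splitLeadingManaged implements the proposed ordering rule: take the
// maximal leading run of managed calls for in-proxy execution, pass the
// remainder through if it is all runner-native, and report safe=false
// (fail closed with a structured retry message) when a managed call
// appears after a runner-native one.
func splitLeadingManaged(calls []ToolCall) (leading, rest []ToolCall, safe bool) {
	i := 0
	for i < len(calls) && calls[i].Managed {
		i++
	}
	leading, rest = calls[:i], calls[i:]
	for _, c := range rest {
		if c.Managed {
			return leading, rest, false // managed after native: do not reorder
		}
	}
	return leading, rest, true
}

func main() {
	leading, rest, safe := splitLeadingManaged([]ToolCall{
		{Name: "service_lookup", Managed: true},
		{Name: "shell", Managed: false},
	})
	fmt.Println(len(leading), len(rest), safe) // 1 1 true
}
```

Note the pass never reorders anything: `leading` and `rest` preserve the model's emitted order, so transcript continuity only has to splice the managed results in front of the untouched native tail.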

Constraints

  • No silent semantic reordering.
  • Maintain current managed-only mediation behavior.
  • Maintain current native-only pass-through behavior.
  • Preserve hidden continuity and session-history tool_trace behavior for managed rounds only.
  • Support both OpenAI-compatible and Anthropic request paths.
  • Streaming re-synthesis for runner-visible native tool-call responses must keep working.

Acceptance criteria

  • Mixed responses where managed calls come first and runner-native calls follow no longer hard-fail.
  • Mixed responses where native calls precede later managed calls still fail closed, but the returned error explicitly tells the agent to split the actions into separate responses and to place managed calls before runner-native calls.
  • Audit/session-history shows a clear managed mediation failure message when the proxy refuses an unsafe mixed order.
  • Regression tests cover OpenAI and Anthropic ordered-mix success and unsafe-order failure.

Likely files

  • cllama/internal/proxy/toolmediation.go
  • cllama/internal/proxy/handler_test.go
  • cllama/internal/proxy/managedcontinuity*.go
  • docs under site/guide/tools.md and site/changelog.md if behavior changes land on master
