Skip to content

fix: validate tool_calls responses at provider boundary#23

Merged
stackbilt-admin merged 1 commit intomainfrom
fix/tool-call-validation
Apr 7, 2026
Merged

fix: validate tool_calls responses at provider boundary#23
stackbilt-admin merged 1 commit intomainfrom
fix/tool-call-validation

Conversation

@stackbilt-admin
Copy link
Copy Markdown
Member

Summary

  • Adds validateToolCalls() to BaseProvider that checks each tool call for required fields (id, type, function.name, function.arguments) and correct types
  • Applied in all five providers (OpenAI, Anthropic, Groq, Cerebras, Cloudflare) at the response formatting boundary
  • Malformed entries are dropped with a logger.warn rather than propagated to callers or throwing

Closes #22

Test plan

  • 12 new tests in tool-call-validation.test.ts covering all providers
  • Tests for: missing id, invalid type, empty function.name, non-string arguments, mixed valid/invalid arrays
  • All 146 tests pass (134 existing + 12 new)
  • TypeScript type check clean

🤖 Generated with Claude Code

Add validateToolCalls() to BaseProvider and apply it in all five providers
(OpenAI, Anthropic, Groq, Cerebras, Cloudflare). Malformed entries are
dropped with a warning rather than propagated to the caller.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@stackbilt-admin stackbilt-admin merged commit 5713b66 into main Apr 7, 2026
3 checks passed
@stackbilt-admin stackbilt-admin deleted the fix/tool-call-validation branch April 21, 2026 21:23
stackbilt-admin added a commit that referenced this pull request Apr 21, 2026
…slice 2) (#49)

* feat(schema-drift): wire envelope validation into openai, groq, cerebras (#39 slice 2)

Follow-up to PR #40 (Anthropic slice 1 of #39). Each OpenAI-compat
provider now validates its /chat/completions response envelope at the
provider boundary and throws SchemaDriftError on mismatch — routing
through the factory's fallback chain and firing onSchemaDrift instead
of corrupting downstream consumers silently.

Per-provider schema constants (not a shared import) — each provider's
envelope is an independent API surface, and correlated drift across
providers is a signal worth detecting, not hiding behind DRY.

The "No choices returned" bare throws in openai/groq/cerebras are
replaced with `SchemaDriftError(<provider>, 'choices[0]', 'object',
'undefined')` so empty-choices is fallback-eligible like every other
envelope failure mode, rather than bubbling up as an uncaught generic
Error.

Also upgrades one tool-call-validation test from PR #23: non-string
`function.arguments` is an envelope contract violation (OpenAI spec
says stringified JSON), so it now routes through drift rather than
silent drop. Division of responsibility:
- Envelope shape violations → schema drift → fallback
- Within-envelope semantic issues (empty id/name) → validateToolCalls
  → silent drop (unchanged)

Driven via describe.each over [openai, groq, cerebras] so schema-parity
is enforced by construction — if one provider's schema diverges, its
tests break loudly.

Tests: 229 → 247 (+18 for the 3 new providers, +0 net on
tool-call-validation — one test's assertion was updated).

Cloudflare deferred to its own PR: Workers AI returns heterogeneous
shapes across model families (response / choices / output / result
wrappers, all optional), needing model-family-aware schema selection
rather than a copy-paste template.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(schema-drift): drop unknown tool_call variants at formatResponse boundary

Review on PR #49 flagged: groq/cerebras formatResponse unconditionally
dereferenced `tc.function.name` / `tc.function.arguments`, but the new
schema's discriminated-union intentionally skips unknown `type` values
for forward-compat. A future `code_interpreter`-style variant with no
`function` field would pass schema validation, then throw a bare
TypeError inside `.map()` — bypassing SchemaDriftError and the
onSchemaDrift hook.

Second failure mode: if the unknown variant happens to carry a
function-shaped payload, it gets mis-surfaced as a normal function
tool call.

Fix: filter `tool_calls` by `type === 'function'` before the map, so
the schema's forward-compat skip is matched by the provider code's
forward-compat drop. OpenAI's version already passes `tc.function`
through untouched, and `validateToolCalls` drops null-function entries
downstream, so no change there.

Test upgrade: the original slice-2 test used a function-shaped mock
and only asserted res.content, so it missed both failure modes. New
test omits the `function` field (exercises the TypeError path) and
asserts `res.toolCalls === undefined` (exercises the mis-surface
path).

All 247 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: validate tool_calls responses at provider boundary

1 participant