fix(ai): preserve providerMetadata as providerOptions in multi-turn tool calls#733

Merged
VaguelySerious merged 1 commit into main from pranaygp/fix-727 on Jan 8, 2026
pranaygp (Contributor) commented Jan 7, 2026

Summary

  • Fixes multi-turn tool calls with Gemini thinking models (e.g., gemini-3-pro-preview), which require thoughtSignature to be preserved across turns
  • Maps providerMetadata from tool call responses to providerOptions in the conversation prompt, following the AI SDK convention

Problem

When using Gemini thinking models with tool calls, multi-turn conversations fail with:

function call is missing a thought_signature

This happens because providerMetadata (which contains thoughtSignature) from the tool call was not being passed back to the provider when reconstructing the conversation prompt.

Solution

In stream-text-iterator.ts, when adding tool calls to the conversation history, we now map providerMetadata → providerOptions:

conversationPrompt.push({
  role: 'assistant',
  content: toolCalls.map((toolCall) => ({
    type: 'tool-call',
    toolCallId: toolCall.toolCallId,
    toolName: toolCall.toolName,
    input: JSON.parse(toolCall.input),
    ...(toolCall.providerMetadata != null
      ? { providerOptions: toolCall.providerMetadata }
      : {}),
  })),
});

This follows the same pattern used in the AI SDK's to-response-messages.ts.
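
To make the mapping easier to try in isolation, here is a minimal, self-contained sketch of the same conditional spread. The types (`ProviderMetadata`, `StreamedToolCall`, `ToolCallPart`), the helper name `toToolCallPart`, and the sample values, including the `google.thoughtSignature` shape, are assumptions made for illustration. They are not the actual definitions in stream-text-iterator.ts or the AI SDK.

```typescript
// Hypothetical, simplified types; the real AI SDK types carry more fields.
type ProviderMetadata = Record<string, Record<string, unknown>>;

interface StreamedToolCall {
  toolCallId: string;
  toolName: string;
  input: string; // JSON-encoded arguments, as in the snippet above
  providerMetadata?: ProviderMetadata;
}

interface ToolCallPart {
  type: 'tool-call';
  toolCallId: string;
  toolName: string;
  input: unknown;
  providerOptions?: ProviderMetadata;
}

// Hypothetical helper: copies providerMetadata into providerOptions only
// when it is present, so parts without metadata gain no extra key.
function toToolCallPart(toolCall: StreamedToolCall): ToolCallPart {
  return {
    type: 'tool-call',
    toolCallId: toolCall.toolCallId,
    toolName: toolCall.toolName,
    input: JSON.parse(toolCall.input),
    ...(toolCall.providerMetadata != null
      ? { providerOptions: toolCall.providerMetadata }
      : {}),
  };
}

// Assumed example values; the signature string is made up.
const withSignature = toToolCallPart({
  toolCallId: 'call-1',
  toolName: 'getWeather',
  input: '{"city":"Paris"}',
  providerMetadata: { google: { thoughtSignature: 'sig-abc' } },
});

const withoutSignature = toToolCallPart({
  toolCallId: 'call-2',
  toolName: 'getTime',
  input: '{}',
});
```

In this sketch `withSignature` carries the metadata forward as `providerOptions`, while `withoutSignature` has no `providerOptions` key at all rather than an explicit `undefined`.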

Testing

Added comprehensive tests in stream-text-iterator.test.ts:

  • Preserving providerMetadata as providerOptions in tool-call messages
  • Not adding providerOptions when providerMetadata is undefined
  • Preserving providerMetadata for multiple parallel tool calls
  • Handling mixed tool calls with and without providerMetadata
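
The mixed case in the list above can be sketched as a plain check, without a test framework. This is not the code from stream-text-iterator.test.ts: `buildAssistantMessage`, the simplified types, and the sample tool calls are hypothetical and only mirror the mapping shown in the Solution section.

```typescript
// Hypothetical, simplified shapes for illustration only.
type ProviderMetadata = Record<string, Record<string, unknown>>;

interface StreamedToolCall {
  toolCallId: string;
  toolName: string;
  input: string; // JSON-encoded arguments
  providerMetadata?: ProviderMetadata;
}

// Hypothetical helper: builds one assistant message from a batch of
// parallel tool calls, applying the providerMetadata -> providerOptions mapping.
function buildAssistantMessage(toolCalls: StreamedToolCall[]) {
  return {
    role: 'assistant' as const,
    content: toolCalls.map((toolCall) => ({
      type: 'tool-call' as const,
      toolCallId: toolCall.toolCallId,
      toolName: toolCall.toolName,
      input: JSON.parse(toolCall.input) as unknown,
      ...(toolCall.providerMetadata != null
        ? { providerOptions: toolCall.providerMetadata }
        : {}),
    })),
  };
}

// Two parallel tool calls: only the first one carries provider metadata.
const mixedMessage = buildAssistantMessage([
  {
    toolCallId: 'call-a',
    toolName: 'search',
    input: '{"query":"workflows"}',
    providerMetadata: { google: { thoughtSignature: 'sig-a' } },
  },
  { toolCallId: 'call-b', toolName: 'lookup', input: '{"id":1}' },
]);

// Parts that ended up with a providerOptions key.
const partsWithOptions = mixedMessage.content.filter(
  (part) => 'providerOptions' in part,
);
```

Here `partsWithOptions` should contain only the `call-a` part, which is the behavior the mixed-metadata test case describes.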

Fixes #727

Copilot AI review requested due to automatic review settings January 7, 2026 06:18
changeset-bot (Bot) commented Jan 7, 2026

🦋 Changeset detected

Latest commit: 7da7897

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages:

Name | Type
@workflow/ai | Patch
@workflow/docs-typecheck | Patch


vercel (Bot, Contributor) commented Jan 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
example-nextjs-workflow-turbopack Ready Ready Preview, Comment Jan 7, 2026 6:23am
example-nextjs-workflow-webpack Ready Ready Preview, Comment Jan 7, 2026 6:23am
example-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-astro-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-express-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-fastify-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-hono-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-nitro-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-nuxt-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-sveltekit-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workbench-vite-workflow Ready Ready Preview, Comment Jan 7, 2026 6:23am
workflow-docs Ready Ready Preview, Comment Jan 7, 2026 6:23am

github-actions (Bot, Contributor) commented Jan 7, 2026

🧪 E2E Test Results

Some tests failed

Summary

Category | Passed | Failed | Skipped | Total
✅ ▲ Vercel Production | 363 | 0 | 11 | 374
❌ 💻 Local Development | 299 | 33 | 8 | 340
✅ 📦 Local Production | 332 | 0 | 8 | 340
✅ 🐘 Local Postgres | 332 | 0 | 8 | 340
✅ 🪟 Windows | 34 | 0 | 0 | 34
❌ 🌍 Community Worlds | 131 | 17 | 0 | 148
Total | 1491 | 50 | 35 | 1576

❌ Failed Tests

💻 Local Development (33 failed)

nitro-stable (33 failed):

  • addTenWorkflow
  • addTenWorkflow
  • promiseAllWorkflow
  • promiseRaceWorkflow
  • promiseAnyWorkflow
  • readableStreamWorkflow
  • hookWorkflow
  • webhookWorkflow
  • webhook route with invalid token
  • sleepingWorkflow
  • nullByteWorkflow
  • workflowAndStepMetadataWorkflow
  • outputStreamWorkflow
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions
  • fetchWorkflow
  • promiseRaceStressTestWorkflow
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • stepDirectCallWorkflow - calling step functions directly outside workflow context
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • closureVariableWorkflow - nested step functions with closure variables
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • health check endpoint - workflow and step endpoints respond to __health query parameter
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly
🌍 Community Worlds (17 failed)

mongodb (1 failed):

  • webhookWorkflow

redis (1 failed):

  • webhookWorkflow

starter (14 failed):

  • addTenWorkflow
  • addTenWorkflow
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly

turso (1 failed):

  • webhookWorkflow

Details by Category

✅ ▲ Vercel Production
App Passed Failed Skipped
✅ astro 33 0 1
✅ example 33 0 1
✅ express 33 0 1
✅ fastify 33 0 1
✅ hono 33 0 1
✅ nextjs-turbopack 33 0 1
✅ nextjs-webpack 33 0 1
✅ nitro 33 0 1
✅ nuxt 33 0 1
✅ sveltekit 33 0 1
✅ vite 33 0 1
❌ 💻 Local Development
App Passed Failed Skipped
✅ astro-stable 33 0 1
✅ express-stable 33 0 1
✅ fastify-stable 33 0 1
✅ hono-stable 33 0 1
✅ nextjs-turbopack-stable 34 0 0
✅ nextjs-webpack-stable 34 0 0
❌ nitro-stable 0 33 1
✅ nuxt-stable 33 0 1
✅ sveltekit-stable 33 0 1
✅ vite-stable 33 0 1
✅ 📦 Local Production
App Passed Failed Skipped
✅ astro-stable 33 0 1
✅ express-stable 33 0 1
✅ fastify-stable 33 0 1
✅ hono-stable 33 0 1
✅ nextjs-turbopack-stable 34 0 0
✅ nextjs-webpack-stable 34 0 0
✅ nitro-stable 33 0 1
✅ nuxt-stable 33 0 1
✅ sveltekit-stable 33 0 1
✅ vite-stable 33 0 1
✅ 🐘 Local Postgres
App Passed Failed Skipped
✅ astro-stable 33 0 1
✅ express-stable 33 0 1
✅ fastify-stable 33 0 1
✅ hono-stable 33 0 1
✅ nextjs-turbopack-stable 34 0 0
✅ nextjs-webpack-stable 34 0 0
✅ nitro-stable 33 0 1
✅ nuxt-stable 33 0 1
✅ sveltekit-stable 33 0 1
✅ vite-stable 33 0 1
✅ 🪟 Windows
App Passed Failed Skipped
✅ nextjs-turbopack 34 0 0
❌ 🌍 Community Worlds
App Passed Failed Skipped
✅ mongodb-dev 3 0 0
❌ mongodb 33 1 0
✅ redis-dev 3 0 0
❌ redis 33 1 0
✅ starter-dev 3 0 0
❌ starter 20 14 0
✅ turso-dev 3 0 0
❌ turso 33 1 0

📋 View full workflow run


Some E2E test jobs failed:

  • Vercel Prod: success
  • Local Dev: failure
  • Local Prod: success
  • Local Postgres: success
  • Windows: success

Check the workflow run for details.

github-actions (Bot, Contributor) commented Jan 7, 2026

📊 Benchmark Results

📈 Comparing against baseline from main branch. Green 🟢 = faster, Red 🔺 = slower.

workflow with no steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Starter 🥇 Next.js (Turbopack) 0.037s (-1.1%) 1.014s (~) 0.977s 10 1.00x
💻 Local Next.js (Turbopack) 0.038s (+10.7% 🔺) 1.014s (~) 0.975s 10 1.03x
🌐 Redis Next.js (Turbopack) 0.041s (-1.7%) 1.017s (~) 0.976s 10 1.11x
💻 Local Nitro 0.044s (-10.2% 🟢) 1.006s (-0.8%) 0.962s 10 1.18x
💻 Local Express 0.046s (+5.2% 🔺) 1.008s (~) 0.961s 10 1.25x
🌐 MongoDB Next.js (Turbopack) 0.083s (-30.1% 🟢) 1.016s (~) 0.934s 10 2.23x
🌐 Turso Next.js (Turbopack) 0.106s (+26.5% 🔺) 1.014s (~) 0.907s 10 2.86x
🐘 Postgres Express 0.221s (-35.6% 🟢) 1.014s (~) 0.793s 10 5.95x
🐘 Postgres Next.js (Turbopack) 0.252s (+65.9% 🔺) 1.029s (+1.0%) 0.777s 10 6.78x
🐘 Postgres Nitro 0.299s (+0.9%) 1.012s (~) 0.714s 10 8.03x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 0.575s (-15.6% 🟢) 1.423s (-15.2% 🟢) 0.848s 10 1.00x
▲ Vercel Express 0.609s (-6.4% 🟢) 1.435s (-15.6% 🟢) 0.826s 10 1.06x
▲ Vercel Next.js (Turbopack) 0.709s (-5.5% 🟢) 1.532s (-9.8% 🟢) 0.823s 10 1.23x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

workflow with 1 step

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Starter 🥇 Next.js (Turbopack) 1.086s (+0.5%) 2.011s (~) 0.925s 10 1.00x
💻 Local Next.js (Turbopack) 1.098s (+1.5%) 2.012s (~) 0.914s 10 1.01x
🌐 Redis Next.js (Turbopack) 1.106s (~) 2.012s (~) 0.906s 10 1.02x
💻 Local Nitro 1.113s (~) 2.006s (~) 0.893s 10 1.02x
💻 Local Express 1.118s (~) 2.008s (~) 0.890s 10 1.03x
🌐 MongoDB Next.js (Turbopack) 1.275s (-1.5%) 2.017s (~) 0.742s 10 1.17x
🌐 Turso Next.js (Turbopack) 1.308s (+1.3%) 2.012s (~) 0.704s 10 1.20x
🐘 Postgres Next.js (Turbopack) 1.660s (-11.0% 🟢) 2.015s (~) 0.355s 10 1.53x
🐘 Postgres Nitro 2.158s (~) 3.013s (~) 0.855s 10 1.99x
🐘 Postgres Express 2.254s (+2.6%) 3.014s (~) 0.761s 10 2.08x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.637s (-11.0% 🟢) 3.575s (-4.2%) 0.938s 10 1.00x
▲ Vercel Express 2.650s (-16.4% 🟢) 3.658s (-12.3% 🟢) 1.007s 10 1.01x
▲ Vercel Next.js (Turbopack) 2.712s (-11.7% 🟢) 3.530s (-10.1% 🟢) 0.817s 10 1.03x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

workflow with 10 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Starter 🥇 Next.js (Turbopack) 10.606s (~) 11.012s (~) 0.406s 5 1.00x
💻 Local Next.js (Turbopack) 10.653s (+1.4%) 11.018s (~) 0.365s 5 1.00x
🌐 Redis Next.js (Turbopack) 10.666s (~) 11.016s (~) 0.349s 5 1.01x
💻 Local Nitro 10.794s (~) 11.012s (~) 0.218s 5 1.02x
💻 Local Express 10.819s (~) 11.016s (~) 0.197s 5 1.02x
🌐 MongoDB Next.js (Turbopack) 12.068s (-1.2%) 12.628s (-3.1%) 0.560s 5 1.14x
🌐 Turso Next.js (Turbopack) 12.182s (~) 13.023s (~) 0.840s 5 1.15x
🐘 Postgres Next.js (Turbopack) 13.714s (-9.9% 🟢) 14.235s (-11.2% 🟢) 0.521s 5 1.29x
🐘 Postgres Express 20.395s (~) 21.036s (~) 0.641s 5 1.92x
🐘 Postgres Nitro 20.505s (~) 21.029s (~) 0.525s 5 1.93x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 22.177s (+1.2%) 22.860s (+1.0%) 0.683s 5 1.00x
▲ Vercel Next.js (Turbopack) 22.274s (~) 22.897s (-0.9%) 0.623s 5 1.00x
▲ Vercel Express 22.821s (+1.9%) 23.459s (~) 0.638s 5 1.03x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

Promise.all with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Redis 🥇 Next.js (Turbopack) 1.348s (-1.1%) 2.010s (~) 0.662s 15 1.00x
🌐 Starter Next.js (Turbopack) 1.361s (+1.4%) 2.008s (~) 0.646s 15 1.01x
💻 Local Next.js (Turbopack) 1.383s (+3.5%) 2.012s (~) 0.629s 15 1.03x
💻 Local Nitro 1.415s (~) 2.005s (~) 0.590s 15 1.05x
💻 Local Express 1.429s (+1.7%) 2.007s (~) 0.579s 15 1.06x
🐘 Postgres Next.js (Turbopack) 1.953s (+1.5%) 2.160s (~) 0.207s 14 1.45x
🌐 MongoDB Next.js (Turbopack) 2.144s (+1.2%) 3.015s (~) 0.871s 10 1.59x
🌐 Turso Next.js (Turbopack) 2.203s (-0.9%) 3.013s (~) 0.810s 10 1.63x
🐘 Postgres Nitro 2.449s (+3.3%) 3.010s (~) 0.562s 10 1.82x
🐘 Postgres Express 2.510s (+2.9%) 3.015s (~) 0.505s 10 1.86x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 2.821s (~) 3.588s (-9.5% 🟢) 0.767s 9 1.00x
▲ Vercel Nitro 3.075s (-9.4% 🟢) 4.000s (-7.9% 🟢) 0.925s 8 1.09x
▲ Vercel Next.js (Turbopack) 3.140s (-11.6% 🟢) 3.954s (-9.1% 🟢) 0.814s 8 1.11x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Promise.all with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Next.js (Turbopack) 2.105s (+5.5% 🔺) 3.022s (+39.8% 🔺) 0.917s 10 1.00x
💻 Local Express 2.209s (-0.6%) 3.159s (~) 0.951s 10 1.05x
💻 Local Nitro 2.217s (+1.5%) 3.177s (+1.5%) 0.960s 10 1.05x
🌐 Starter Next.js (Turbopack) 2.477s (+0.8%) 3.011s (~) 0.534s 10 1.18x
🌐 Redis Next.js (Turbopack) 2.528s (+1.4%) 3.034s (+0.7%) 0.506s 10 1.20x
🐘 Postgres Next.js (Turbopack) 2.630s (+4.2%) 3.013s (~) 0.383s 10 1.25x
🐘 Postgres Nitro 2.854s (+2.3%) 3.012s (-0.7%) 0.158s 10 1.36x
🐘 Postgres Express 3.060s (+5.6% 🔺) 3.362s (+8.0% 🔺) 0.302s 9 1.45x
🌐 MongoDB Next.js (Turbopack) 4.718s (+1.9%) 5.184s (~) 0.466s 6 2.24x
🌐 Turso Next.js (Turbopack) 4.744s (+0.5%) 5.184s (~) 0.440s 6 2.25x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 3.351s (+12.7% 🔺) 3.810s (+6.0% 🔺) 0.459s 9 1.00x
▲ Vercel Express 3.432s (+4.2%) 3.969s (-1.9%) 0.538s 8 1.02x
▲ Vercel Next.js (Turbopack) 3.495s (+2.3%) 3.962s (-3.3%) 0.467s 8 1.04x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Promise.race with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Starter 🥇 Next.js (Turbopack) 1.342s (-1.0%) 2.007s (~) 0.665s 15 1.00x
🌐 Redis Next.js (Turbopack) 1.376s (+1.5%) 2.010s (~) 0.635s 15 1.02x
💻 Local Next.js (Turbopack) 1.395s (+2.8%) 2.014s (~) 0.619s 15 1.04x
💻 Local Express 1.420s (-0.7%) 2.006s (~) 0.585s 15 1.06x
💻 Local Nitro 1.422s (+0.7%) 2.006s (~) 0.585s 15 1.06x
🐘 Postgres Next.js (Turbopack) 1.663s (-6.6% 🟢) 2.012s (~) 0.349s 15 1.24x
🐘 Postgres Express 1.784s (-21.3% 🟢) 2.155s (-28.7% 🟢) 0.372s 14 1.33x
🐘 Postgres Nitro 1.975s (+8.0% 🔺) 2.323s (+15.7% 🔺) 0.348s 13 1.47x
🌐 MongoDB Next.js (Turbopack) 2.128s (-0.8%) 3.014s (~) 0.886s 10 1.59x
🌐 Turso Next.js (Turbopack) 2.225s (~) 3.012s (~) 0.787s 10 1.66x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 2.745s (-3.5%) 3.644s (-5.3% 🟢) 0.900s 9 1.00x
▲ Vercel Nitro 2.886s (+4.4%) 3.659s (+2.1%) 0.773s 9 1.05x
▲ Vercel Next.js (Turbopack) 2.890s (+3.8%) 3.646s (-1.6%) 0.756s 9 1.05x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Promise.race with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Next.js (Turbopack) 2.210s (+7.3% 🔺) 3.166s (+19.7% 🔺) 0.956s 10 1.00x
💻 Local Nitro 2.237s (+0.9%) 3.197s (+1.0%) 0.960s 10 1.01x
💻 Local Express 2.265s (-0.6%) 3.203s (~) 0.938s 10 1.02x
🌐 Starter Next.js (Turbopack) 2.452s (-0.6%) 3.010s (~) 0.557s 10 1.11x
🌐 Redis Next.js (Turbopack) 2.507s (+1.2%) 3.012s (~) 0.505s 10 1.13x
🐘 Postgres Next.js (Turbopack) 2.664s (+9.7% 🔺) 3.020s (~) 0.356s 10 1.21x
🐘 Postgres Express 2.879s (~) 3.133s (+3.9%) 0.253s 10 1.30x
🐘 Postgres Nitro 2.966s (+18.7% 🔺) 3.113s (+3.2%) 0.147s 10 1.34x
🌐 Turso Next.js (Turbopack) 4.704s (-1.2%) 5.183s (~) 0.479s 6 2.13x
🌐 MongoDB Next.js (Turbopack) 4.705s (+0.6%) 5.188s (~) 0.483s 6 2.13x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 3.355s (+4.6%) 4.055s (+5.8% 🔺) 0.700s 8 1.00x
▲ Vercel Nitro 3.438s (+7.9% 🔺) 4.112s (+11.9% 🔺) 0.674s 8 1.02x
▲ Vercel Next.js (Turbopack) 3.828s (+18.1% 🔺) 4.480s (+16.5% 🔺) 0.652s 7 1.14x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Stream Benchmarks (includes TTFB metrics)
workflow with stream

💻 Local Development

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Next.js (Turbopack) 0.136s (+34.8% 🔺) 1.003s (~) 0.016s (-2.4%) 1.027s (~) 0.891s 10 1.00x
🌐 Starter Next.js (Turbopack) 0.139s (+7.9% 🔺) 1.005s (~) 0.000s (NaN%) 1.011s (~) 0.873s 10 1.02x
🌐 Redis Next.js (Turbopack) 0.151s (+5.0%) 1.005s (~) 0.000s (NaN%) 1.014s (~) 0.863s 10 1.11x
💻 Local Nitro 0.177s (+1.4%) 0.993s (~) 0.020s (+28.9% 🔺) 1.026s (~) 0.849s 10 1.30x
💻 Local Express 0.186s (+6.0% 🔺) 0.993s (~) 0.015s (-10.2% 🟢) 1.023s (~) 0.837s 10 1.37x
🌐 Turso Next.js (Turbopack) 0.478s (-5.6% 🟢) 0.973s (+2.9%) 0.000s (+Infinity% 🔺) 1.014s (~) 0.536s 10 3.51x
🌐 MongoDB Next.js (Turbopack) 0.496s (+5.9% 🔺) 0.952s (-2.9%) 0.000s (+Infinity% 🔺) 1.015s (~) 0.519s 10 3.64x
🐘 Postgres Next.js (Turbopack) 0.784s (-33.9% 🟢) 0.883s (-52.6% 🟢) 0.000s (+Infinity% 🔺) 1.015s (-49.7% 🟢) 0.231s 10 5.75x
🐘 Postgres Express 2.183s (-7.3% 🟢) 2.859s (+6.5% 🔺) 0.000s (+Infinity% 🔺) 3.015s (~) 0.832s 10 16.02x
🐘 Postgres Nitro 2.334s (+1.1%) 2.706s (-1.0%) 0.000s (NaN%) 3.012s (~) 0.678s 10 17.12x

▲ Production (Vercel)

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 2.840s (+1.7%) 3.169s (-3.9%) 0.680s (+110.8% 🔺) 4.326s (+4.8%) 1.486s 10 1.00x
▲ Vercel Next.js (Turbopack) 2.848s (+1.5%) 3.248s (+2.6%) 0.502s (-23.1% 🟢) 4.162s (-2.8%) 1.314s 10 1.00x
▲ Vercel Nitro 2.880s (-8.2% 🟢) 3.203s (-13.0% 🟢) 0.631s (-28.5% 🟢) 4.273s (-13.7% 🟢) 1.393s 10 1.01x

🔍 Observability: Express | Next.js (Turbopack) | Nitro

Summary

Fastest Framework by World

Winner determined by most benchmark wins

World 🥇 Fastest Framework Wins
💻 Local Next.js (Turbopack) 8/8
🐘 Postgres Next.js (Turbopack) 7/8
▲ Vercel Nitro 4/8
Fastest World by Framework

Winner determined by most benchmark wins

Framework 🥇 Fastest World Wins
Express 💻 Local 8/8
Next.js (Turbopack) 🌐 Starter 4/8
Nitro 💻 Local 8/8
Column Definitions
  • Workflow Time: Runtime reported by workflow (completedAt - createdAt) - primary metric
  • TTFB: Time to First Byte - time from workflow start until first stream byte received (stream benchmarks only)
  • Slurp: Time from first byte to complete stream consumption (stream benchmarks only)
  • Wall Time: Total testbench time (trigger workflow + poll for result)
  • Overhead: Testbench overhead (Wall Time - Workflow Time)
  • Samples: Number of benchmark iterations run
  • vs Fastest: How much slower compared to the fastest configuration for this benchmark
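
As a rough illustration of the two derived columns, the sketch below recomputes Overhead and vs Fastest for two rows of the "workflow with no steps" local table. It assumes that vs Fastest is the ratio of Workflow Time to the fastest Workflow Time in the same table, which the definitions above suggest but do not state outright. The names `BenchmarkSample`, `roundTo`, and `deriveColumns` are hypothetical and are not taken from the benchmark tooling.

```typescript
// Hypothetical row shape; times are in seconds, copied from the table above.
interface BenchmarkSample {
  label: string;
  workflowTimeSec: number; // runtime reported by the workflow
  wallTimeSec: number; // total testbench time (trigger + poll)
}

interface DerivedColumns {
  label: string;
  overheadSec: number;
  vsFastest: number;
}

// Rounds to a fixed number of decimals to match the table's precision.
function roundTo(value: number, decimals: number): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

// Overhead = Wall Time - Workflow Time.
// vs Fastest = Workflow Time / fastest Workflow Time (assumed definition).
function deriveColumns(samples: BenchmarkSample[]): DerivedColumns[] {
  const fastest = Math.min(...samples.map((s) => s.workflowTimeSec));
  return samples.map((s) => ({
    label: s.label,
    overheadSec: roundTo(s.wallTimeSec - s.workflowTimeSec, 3),
    vsFastest: roundTo(s.workflowTimeSec / fastest, 2),
  }));
}

const derived = deriveColumns([
  { label: 'Starter / Next.js (Turbopack)', workflowTimeSec: 0.037, wallTimeSec: 1.014 },
  { label: 'Redis / Next.js (Turbopack)', workflowTimeSec: 0.041, wallTimeSec: 1.017 },
]);
```

For these two rows the sketch reproduces the published 0.977s and 0.976s overheads and the 1.00x and 1.11x ratios. Other rows can differ in the last digit, presumably because the published figures are derived from unrounded measurements.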

Worlds:

  • 💻 Local: In-memory filesystem world (local development)
  • 🐘 Postgres: PostgreSQL database world (local development)
  • ▲ Vercel: Vercel production/preview deployment
  • 🌐 Starter: Community world (local development)
  • 🌐 Turso: Community world (local development)
  • 🌐 MongoDB: Community world (local development)
  • 🌐 Redis: Community world (local development)
  • 🌐 Jazz: Community world (local development)

📋 View full workflow run

…ool calls

When tool calls are added to the conversation history, map providerMetadata
to providerOptions following the AI SDK convention. This fixes Gemini thinking
models that require thoughtSignature to be preserved across multi-turn tool calls,
preventing the error 'function call is missing a thought_signature'.

Fixes #727

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes a critical bug affecting Gemini thinking models (e.g., gemini-3-pro-preview), where multi-turn tool calls failed with the error "function call is missing a thought_signature". The fix ensures that providerMetadata from tool calls is correctly mapped to providerOptions when reconstructing the conversation prompt for subsequent turns.

Key Changes:

  • Map providerMetadata to providerOptions when adding tool calls to conversation history
  • Add comprehensive test coverage for the providerMetadata preservation logic
  • Document the fix in a changeset for release notes

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
packages/ai/src/agent/stream-text-iterator.ts | Added logic to map providerMetadata from tool calls to providerOptions in the conversation prompt, with detailed inline comments explaining the critical nature of this mapping for Gemini thinking models
packages/ai/src/agent/stream-text-iterator.test.ts | Added comprehensive test suite with 4 test cases covering: single tool call with metadata, tool call without metadata, multiple parallel tool calls, and mixed scenarios with and without metadata
.changeset/fix-provider-metadata-tool-calls.md | Added changeset describing the bug fix for release documentation


];

// Second iteration - should trigger second doStreamStep call
const secondResult = await iterator.next(toolResults);

Copilot AI commented Jan 7, 2026

Unused variable secondResult.

Suggested change
- const secondResult = await iterator.next(toolResults);
+ await iterator.next(toolResults);

VaguelySerious (Member) left a comment

LGTM, test looks not too useful, but no harm in merging IMO

@VaguelySerious VaguelySerious merged commit 4b43186 into main Jan 8, 2026
89 of 91 checks passed
@VaguelySerious VaguelySerious deleted the pranaygp/fix-727 branch January 8, 2026 09:56
TooTallNate added a commit that referenced this pull request Jan 12, 2026
* feat: add queue-based health check to bypass Deployment Protection

- Add HealthCheckPayloadSchema and HEALTH_CHECK_STREAM_PREFIX to @workflow/world
- Add healthCheck() method to Queue interface
- Update workflow and step handlers to detect and respond to health check messages
- Implement healthCheck() in world-local, world-vercel, and world-postgres

The queue-based health check sends a message through the queue pipeline,
which bypasses Vercel's Deployment Protection. The handler writes a response
to a stream that the caller reads to confirm health.

This complements the existing HTTP-based ?__health approach which still works
for local development and when bypass headers are available.

* refactor: move healthCheck to core package as utility function

Instead of adding healthCheck to the World interface (which duplicated
the same implementation across all worlds), this is now a utility function
in @workflow/core that takes the World as a parameter.

Usage:
  import { healthCheck } from '@workflow/core';
  const result = await healthCheck(world, 'workflow');

This is cleaner because:
- Single implementation instead of 3 identical ones
- World implementations remain simple
- No changes needed to the World interface

* .

* refactor: move health check types from world to core

Health check types (HealthCheckPayloadSchema, HealthCheckResult, etc.)
are now defined in @workflow/core since that's where they're used.

The HealthCheckPayloadSchema is still part of QueuePayloadSchema in
world (so the queue accepts health check messages), but it's not
exported from the public API.

* .

* Refactor health check implementation based on code review feedback (#746)

* Initial plan

* Address PR review comments: export types, fix race condition, improve error handling

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* Add queue-based health check test and document security considerations

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* Replace 'any' type with proper type guards for health check response

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* Extract health check queue names as constants and improve type guards

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* .

* Fix e2e test

* .

* .

* .

* fix(ai): preserve providerMetadata as providerOptions in multi-turn tool calls (#733)

When tool calls are added to the conversation history, map providerMetadata
to providerOptions following the AI SDK convention. This fixes Gemini thinking
models that require thoughtSignature to be preserved across multi-turn tool calls,
preventing the error 'function call is missing a thought_signature'.

Fixes #727

* Local ui cli flag (#744)

* [web] Increase contrast on attribute items in sidebar (#736)

Signed-off-by: Peter Wielander <mittgfu@gmail.com>

* [world] Remove pause and resume events, actions and states (#751)

* Version Packages (beta) (#735)

* .

* .

* Update turbo inputs to include shared config (#752)

* Update turbo inputs to include shared config

* Apply suggestions from code review

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

---------

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

* feat(web): add self-hosted mode for world configuration (#747)

* feat(web): add self-hosted mode for world configuration

When WORKFLOW_TARGET_WORLD env var is set, the web UI operates in
self-hosted mode where the world configuration is locked to server-side
environment variables and cannot be changed via query params or UI.

- Add getHardcodedConfig server action to detect self-hosted mode
- Modify getWorldFromEnv to use server env vars in hardcoded mode
- Create WorldConfigContext to provide config state app-wide
- Update settings sidebar to show locked state with disabled inputs
- Update connection status to show PostgreSQL backend info
- Mask sensitive values (postgres URL) in hardcoded mode UI

* fix: address PR review feedback

- Remove unused ConfigMode type export
- Fix postgres substring to undefined (tooltip has details)
- Extract buildEnvMapFromProcessEnv helper to reduce duplication
- Remove unused EnvMap import from layout-client
- Import HardcodedConfig from web-shared/server instead of re-defining

* Fix: PostgreSQL URL parameter missing from configParsers, causing loss of postgres URL configuration on page reload in dynamic mode

* fix(cli): clear WORKFLOW_TARGET_WORLD when spawning web server

The CLI sets WORKFLOW_TARGET_WORLD as an env var, which the spawned
Next.js server inherits. This caused the web UI to enter self-hosted
mode even when launched via CLI.

Now we explicitly clear WORKFLOW_TARGET_WORLD from the server's
environment so it starts in dynamic mode where config comes from
query params as intended.

* refactor(web): use server-side env vars for world config

BREAKING CHANGE: The web UI no longer supports configuring the world
backend via URL query parameters. Configuration is now read exclusively
from server-side environment variables.

Changes:
- Remove query param parsing from @workflow/web config.ts
- Add ServerConfig interface with non-sensitive display info
- Update all components to use useServerConfig() hook
- Settings sidebar is now read-only
- CLI passes env vars to spawned web server instead of query params
- Server actions use process.env directly (envMap param reserved for future use)

This simplifies the architecture and improves security by never sending
sensitive data (connection strings, auth tokens) to the client.

* fix(web): fix settings sidebar overflow and shorten data dir path

- Add truncate/overflow handling to settings sidebar config values
- Add shortenPath() helper to abbreviate long file paths:
  - Replaces home directory with ~
  - Shows .../last-two-segments if still too long
- Add title attributes for full path on hover

* Update changeset

---------

Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>

* Version Packages (beta) (#755)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update packages/world/src/queue.ts

Co-authored-by: Pranay Prakash <pranay.gp@gmail.com>

* [web] Tidy wake-up and re-enqueue buttons (#737)


---------

Signed-off-by: Peter Wielander <mittgfu@gmail.com>

* [cli] Use dotenv to resolve .env and .env.local files on startup (#765)

* Use temporary workflow-server deployment URL

* .

* .

* Update packages/world/src/queue.ts

Co-authored-by: Pranay Prakash <pranay.gp@gmail.com>

* Use temporary workflow-server deployment URL

* .

* .

---------

Signed-off-by: Peter Wielander <mittgfu@gmail.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>
Co-authored-by: Pranay Prakash <pranay.gp@gmail.com>
Co-authored-by: Peter Wielander <mittgfu@gmail.com>
Co-authored-by: Vercel Release Bot <88769842+vercel-release-bot@users.noreply.github.com>
Co-authored-by: JJ Kasper <jj@jjsweb.site>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
VaguelySerious added a commit that referenced this pull request Jan 16, 2026
* feat: add queue-based health check to bypass Deployment Protection

- Add HealthCheckPayloadSchema and HEALTH_CHECK_STREAM_PREFIX to @workflow/world
- Add healthCheck() method to Queue interface
- Update workflow and step handlers to detect and respond to health check messages
- Implement healthCheck() in world-local, world-vercel, and world-postgres

The queue-based health check sends a message through the queue pipeline,
which bypasses Vercel's Deployment Protection. The handler writes a response
to a stream that the caller reads to confirm health.

This complements the existing HTTP-based ?__health approach which still works
for local development and when bypass headers are available.

* refactor: move healthCheck to core package as utility function

Instead of adding healthCheck to the World interface (which duplicated
the same implementation across all worlds), this is now a utility function
in @workflow/core that takes the World as a parameter.

Usage:
  import { healthCheck } from '@workflow/core';
  const result = await healthCheck(world, 'workflow');

This is cleaner because:
- Single implementation instead of 3 identical ones
- World implementations remain simple
- No changes needed to the World interface

* .

* refactor: move health check types from world to core

Health check types (HealthCheckPayloadSchema, HealthCheckResult, etc.)
are now defined in @workflow/core since that's where they're used.

The HealthCheckPayloadSchema is still part of QueuePayloadSchema in
world (so the queue accepts health check messages), but it's not
exported from the public API.

* .

* Refactor health check implementation based on code review feedback (#746)

* Initial plan

* Address PR review comments: export types, fix race condition, improve error handling

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* Add queue-based health check test and document security considerations

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* Replace 'any' type with proper type guards for health check response

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* Extract health check queue names as constants and improve type guards

Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>

* .

* Fix e2e test

* .

* .

* .

* fix(ai): preserve providerMetadata as providerOptions in multi-turn tool calls (#733)

When tool calls are added to the conversation history, map providerMetadata
to providerOptions following the AI SDK convention. This fixes Gemini thinking
models that require thoughtSignature to be preserved across multi-turn tool calls,
preventing the error 'function call is missing a thought_signature'.

Fixes #727
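
As a standalone illustration of the mapping this commit describes, the sketch below converts streamed tool calls into assistant `tool-call` parts. The `StreamedToolCall` and `ToolCallPart` shapes and the `toToolCallPart` helper are simplified, hypothetical stand-ins for the AI SDK types, and the `google.thoughtSignature` value is made-up sample data.

```typescript
// Simplified, hypothetical shapes; the real AI SDK types carry more fields.
type ProviderMetadata = Record<string, Record<string, unknown>>;

interface StreamedToolCall {
  toolCallId: string;
  toolName: string;
  input: string; // JSON-encoded arguments
  providerMetadata?: ProviderMetadata;
}

interface ToolCallPart {
  type: 'tool-call';
  toolCallId: string;
  toolName: string;
  input: unknown;
  providerOptions?: ProviderMetadata;
}

// Copy providerMetadata to providerOptions only when it is present,
// so calls without metadata do not gain an empty key.
function toToolCallPart(toolCall: StreamedToolCall): ToolCallPart {
  return {
    type: 'tool-call',
    toolCallId: toolCall.toolCallId,
    toolName: toolCall.toolName,
    input: JSON.parse(toolCall.input),
    ...(toolCall.providerMetadata != null
      ? { providerOptions: toolCall.providerMetadata }
      : {}),
  };
}

// Mixed case: one call carries metadata, the other does not.
const sampleCalls: StreamedToolCall[] = [
  {
    toolCallId: 'call-1',
    toolName: 'getWeather',
    input: '{"city":"Paris"}',
    providerMetadata: { google: { thoughtSignature: 'sig-123' } },
  },
  { toolCallId: 'call-2', toolName: 'getTime', input: '{}' },
];

const sampleParts = sampleCalls.map(toToolCallPart);
```

Here `sampleParts[0]` carries `providerOptions` with the signature, while `sampleParts[1]` has no `providerOptions` key at all.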

* Local ui cli flag (#744)

* [web] Increase contrast on attribute items in sidebar (#736)

Signed-off-by: Peter Wielander <mittgfu@gmail.com>

* [world] Remove pause and resume events, actions and states (#751)

* Version Packages (beta) (#735)

* .

* .

* Update turbo inputs to include shared config (#752)

* Update turbo inputs to include shared config

* Apply suggestions from code review

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

---------

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

* feat(web): add self-hosted mode for world configuration (#747)

* feat(web): add self-hosted mode for world configuration

When WORKFLOW_TARGET_WORLD env var is set, the web UI operates in
self-hosted mode where the world configuration is locked to server-side
environment variables and cannot be changed via query params or UI.

- Add getHardcodedConfig server action to detect self-hosted mode
- Modify getWorldFromEnv to use server env vars in hardcoded mode
- Create WorldConfigContext to provide config state app-wide
- Update settings sidebar to show locked state with disabled inputs
- Update connection status to show PostgreSQL backend info
- Mask sensitive values (postgres URL) in hardcoded mode UI

* fix: address PR review feedback

- Remove unused ConfigMode type export
- Fix postgres substring to undefined (tooltip has details)
- Extract buildEnvMapFromProcessEnv helper to reduce duplication
- Remove unused EnvMap import from layout-client
- Import HardcodedConfig from web-shared/server instead of re-defining

* Fix: PostgreSQL URL parameter missing from configParsers, causing loss of postgres URL configuration on page reload in dynamic mode

* fix(cli): clear WORKFLOW_TARGET_WORLD when spawning web server

The CLI sets WORKFLOW_TARGET_WORLD as an env var, which the spawned
Next.js server inherits. This caused the web UI to enter self-hosted
mode even when launched via CLI.

Now we explicitly clear WORKFLOW_TARGET_WORLD from the server's
environment so it starts in dynamic mode where config comes from
query params as intended.

* refactor(web): use server-side env vars for world config

BREAKING CHANGE: The web UI no longer supports configuring the world
backend via URL query parameters. Configuration is now read exclusively
from server-side environment variables.

Changes:
- Remove query param parsing from @workflow/web config.ts
- Add ServerConfig interface with non-sensitive display info
- Update all components to use useServerConfig() hook
- Settings sidebar is now read-only
- CLI passes env vars to spawned web server instead of query params
- Server actions use process.env directly (envMap param reserved for future use)

This simplifies the architecture and improves security by never sending
sensitive data (connection strings, auth tokens) to the client.

* fix(web): fix settings sidebar overflow and shorten data dir path

- Add truncate/overflow handling to settings sidebar config values
- Add shortenPath() helper to abbreviate long file paths:
  - Replaces home directory with ~
  - Shows .../last-two-segments if still too long
- Add title attributes for full path on hover
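
A minimal sketch of the path shortening described in the bullets above, assuming POSIX-style `/` separators. The signature, the `homeDir` parameter, and the `maxLength` default of 40 are assumptions made for illustration; the actual helper in the web package may differ.

```typescript
// Hypothetical sketch; the real helper's signature and length limit are not shown in this log.
function shortenPath(fullPath: string, homeDir: string, maxLength = 40): string {
  let result = fullPath;
  // Replace a leading home directory with "~".
  if (homeDir !== '' && (result === homeDir || result.startsWith(homeDir + '/'))) {
    result = '~' + result.slice(homeDir.length);
  }
  if (result.length <= maxLength) return result;
  // Still too long: keep only the last two path segments.
  const segments = result.split('/').filter((s) => s.length > 0);
  if (segments.length <= 2) return result;
  return '.../' + segments.slice(-2).join('/');
}
```

The full path would still be exposed through a `title` attribute, so shortening only affects what is displayed.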

* Update changeset

---------

Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>

* Version Packages (beta) (#755)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update packages/world/src/queue.ts

Co-authored-by: Pranay Prakash <pranay.gp@gmail.com>

* [web] Tidy wake-up and re-enqueue buttons (#737)


---------

Signed-off-by: Peter Wielander <mittgfu@gmail.com>

* [cli] Use dotenv to resolve .env and .env.local files on startup (#765)

* Use temporary workflow-server deployment URL

---------

Signed-off-by: Peter Wielander <mittgfu@gmail.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: TooTallNate <71256+TooTallNate@users.noreply.github.com>
Co-authored-by: Pranay Prakash <pranay.gp@gmail.com>
Co-authored-by: Peter Wielander <mittgfu@gmail.com>
Co-authored-by: Vercel Release Bot <88769842+vercel-release-bot@users.noreply.github.com>
Co-authored-by: JJ Kasper <jj@jjsweb.site>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Gemini tool-calls fail after first step because thought_signature is dropped
