Preserve reasoning content in DurableAgent conversation history#1444

Merged
VaguelySerious merged 9 commits into main from gr2m/preserve-reasoning-content on Apr 3, 2026

Conversation

gr2m (Contributor) commented Mar 18, 2026

Summary

Closes #1393

  • Include reasoning content parts in the assistant message alongside tool-call parts in stream-text-iterator.ts, mirroring toResponseMessages() in the AI SDK
  • Remove sanitizeProviderMetadataForToolCall() and OpenAI itemId stripping — with reasoning items preserved, itemId references become valid
  • Fix chunksToStep() to preserve providerMetadata on reasoning parts (needed for OpenAI Responses API item_reference)
  • Fix chunksToStep() to aggregate reasoning from reasoning-start chunks, not just reasoning-delta — encrypted reasoning (OpenAI o-series) emits no deltas
  • Update tests to verify reasoning preservation and reflect the removal of itemId sanitization
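
To illustrate the first two bullets, here is a sketch of the assistant message shape the fix produces; the ids, metadata keys, and the weather tool are hypothetical placeholders, and the real parts are assembled in stream-text-iterator.ts:

```javascript
// Sketch (hypothetical ids): the persisted assistant message now keeps the
// reasoning part alongside the tool-call part, with providerMetadata carried
// over as providerOptions so the OpenAI Responses API can match the rs_...
// reasoning item to its fc_... function call on the follow-up request.
const assistantMessage = {
  role: 'assistant',
  content: [
    {
      type: 'reasoning',
      text: '', // encrypted reasoning (OpenAI o-series) may emit no visible text
      providerOptions: { openai: { itemId: 'rs_123' } },
    },
    {
      type: 'tool-call',
      toolCallId: 'call_abc',
      toolName: 'getWeather',
      input: { city: 'San Francisco' },
      providerOptions: { openai: { itemId: 'fc_456' } },
    },
  ],
};

// Before the fix, the reasoning part was dropped and itemId was stripped, so
// only the bare tool-call part was replayed and the API rejected the request.
console.log(assistantMessage.content.map((part) => part.type).join(','));
// prints "reasoning,tool-call"
```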

Manual validation (OpenAI Responses API, #880)

The following script reproduces the error from #880 and confirms the fix. It requires an OPENAI_API_KEY env var with access to o4-mini.

Save as packages/ai/test-openai-reasoning.mjs and run with cd packages/ai && node test-openai-reasoning.mjs:

import { createOpenAI } from '@ai-sdk/openai';

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const model = openai.responses('o4-mini');

// Step 1: Send prompt that triggers a tool call with reasoning
console.log('Step 1: Sending initial request to trigger tool call...');
const result1 = await model.doStream({
  prompt: [
    { role: 'system', content: 'You are a helpful assistant. Always use the getWeather tool.' },
    { role: 'user', content: [{ type: 'text', text: 'What is the weather in San Francisco?' }] },
  ],
  tools: [{
    type: 'function',
    name: 'getWeather',
    parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] },
  }],
  toolChoice: { type: 'required' },
});

const chunks = [];
const reader = result1.stream.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  chunks.push(value);
}

const reasoningChunks = chunks.filter(c =>
  c.type === 'reasoning-start' || c.type === 'reasoning-delta' || c.type === 'reasoning-end'
);
const toolCallChunks = chunks.filter(c => c.type === 'tool-call');
console.log(`  Reasoning chunks: ${reasoningChunks.length}, Tool call chunks: ${toolCallChunks.length}`);

// Step 2: Build reasoning parts (mirrors chunksToStep + stream-text-iterator fix)
const reasoningById = new Map();
for (const chunk of reasoningChunks) {
  if (chunk.type === 'reasoning-start') {
    reasoningById.set(chunk.id, { text: '', providerMetadata: chunk.providerMetadata });
  } else if (chunk.type === 'reasoning-delta') {
    const entry = reasoningById.get(chunk.id);
    if (entry) {
      entry.text += chunk.delta;
      if (chunk.providerMetadata) entry.providerMetadata = chunk.providerMetadata;
    }
  }
}
const reasoningParts = Array.from(reasoningById.values()).map(r => ({
  type: 'reasoning',
  text: r.text,
  ...(r.providerMetadata != null ? { providerOptions: r.providerMetadata } : {}),
}));
const toolCallParts = toolCallChunks.map(tc => ({
  type: 'tool-call',
  toolCallId: tc.toolCallId,
  toolName: tc.toolName,
  input: JSON.parse(tc.input),
  ...(tc.providerMetadata != null ? { providerOptions: tc.providerMetadata } : {}),
}));

const toolResultContent = toolCallChunks.map(tc => ({
  type: 'tool-result',
  toolCallId: tc.toolCallId,
  toolName: tc.toolName,
  output: { type: 'text', value: JSON.stringify({ city: 'San Francisco', temperature: 62, condition: 'partly cloudy' }) },
}));

// Step 3: Follow-up WITH reasoning (the fix)
console.log('\nStep 3: Follow-up WITH reasoning preserved...');
try {
  const result2 = await model.doStream({
    prompt: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: [{ type: 'text', text: 'What is the weather in San Francisco?' }] },
      { role: 'assistant', content: [...reasoningParts, ...toolCallParts] },
      { role: 'tool', content: toolResultContent },
    ],
    tools: [{ type: 'function', name: 'getWeather', parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] } }],
  });
  const reader2 = result2.stream.getReader();
  let text = '';
  while (true) {
    const { done, value } = await reader2.read();
    if (done) break;
    if (value.type === 'text-delta') text += value.delta;
  }
  console.log(`  ✅ SUCCESS: ${text.slice(0, 150)}`);
} catch (e) {
  console.error(`  ❌ FAILED: ${e.message}`);
  process.exit(1);
}

// Step 4: Follow-up WITHOUT reasoning (old behavior — should fail)
console.log('\nStep 4: Follow-up WITHOUT reasoning (old behavior)...');
try {
  const result3 = await model.doStream({
    prompt: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: [{ type: 'text', text: 'What is the weather in San Francisco?' }] },
      { role: 'assistant', content: [...toolCallParts] }, // no reasoning parts
      { role: 'tool', content: toolResultContent },
    ],
    tools: [{ type: 'function', name: 'getWeather', parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] } }],
  });
  const reader3 = result3.stream.getReader();
  while (true) { const { done } = await reader3.read(); if (done) break; }
  console.log('  (Succeeded unexpectedly — model may not have used reasoning)');
} catch (e) {
  console.log(`  ❌ Old behavior failed as expected: ${e.message.slice(0, 120)}`);
}

Expected output:

Step 1: Sending initial request to trigger tool call...
  Reasoning chunks: 2, Tool call chunks: 1

Step 3: Follow-up WITH reasoning preserved...
  ✅ SUCCESS: The current weather in San Francisco is partly cloudy with a temperature of 62°F.

Step 4: Follow-up WITHOUT reasoning (old behavior)...
  ❌ Old behavior failed as expected: Item 'fc_...' of type 'function_call' was provided without its required 'reasoning' item: 'rs_...'

Test plan

vercel Bot commented Mar 18, 2026

changeset-bot Bot commented Mar 18, 2026

🦋 Changeset detected

Latest commit: 403261f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @workflow/ai | Patch |


github-actions Bot commented Mar 18, 2026

🧪 E2E Test Results

Some tests failed

Summary

| Category | Passed | Failed | Skipped | Total |
| --- | --- | --- | --- | --- |
| ❌ ▲ Vercel Production | 867 | 1 | 67 | 935 |
| ✅ 💻 Local Development | 842 | 0 | 178 | 1020 |
| ✅ 📦 Local Production | 842 | 0 | 178 | 1020 |
| ✅ 🐘 Local Postgres | 842 | 0 | 178 | 1020 |
| ✅ 🪟 Windows | 77 | 0 | 8 | 85 |
| ❌ 🌍 Community Worlds | 135 | 60 | 24 | 219 |
| ✅ 📋 Other | 213 | 0 | 42 | 255 |
| Total | 3818 | 61 | 675 | 4554 |

❌ Failed Tests

▲ Vercel Production (1 failed)

vite (1 failed):

🌍 Community Worlds (60 failed)

mongodb (2 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KNAK5WH0H83KQ454GT76MS40
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KNAKHM6PTR4TKBA2R398XJGP

redis (2 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KNAK5WH0H83KQ454GT76MS40
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KNAKHM6PTR4TKBA2R398XJGP

turso (56 failed):

  • addTenWorkflow | wrun_01KNAK4K3DSZ5NQ4DRQSQXRNN7
  • addTenWorkflow | wrun_01KNAK4K3DSZ5NQ4DRQSQXRNN7
  • wellKnownAgentWorkflow (.well-known/agent) | wrun_01KNAK6B0K5HCATRYCEX09ACA1
  • should work with react rendering in step
  • promiseAllWorkflow | wrun_01KNAK4X78CFMGGHFHTCGJ8EV2
  • promiseRaceWorkflow | wrun_01KNAK51QF4RXECH82M926KFWC
  • promiseAnyWorkflow | wrun_01KNAK53YSX1J79VJ4XRRZS809
  • importedStepOnlyWorkflow | wrun_01KNAK6QTD7BFBWKKK95QCGA1R
  • hookWorkflow | wrun_01KNAK5GEPGS4KVTN1S9DRHQE0
  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KNAK5WH0H83KQ454GT76MS40
  • webhookWorkflow | wrun_01KNAK65FNVV0XTHJK4CDMMSX8
  • sleepingWorkflow | wrun_01KNAK9W0GFH2D7WMZXACA1SZ2
  • parallelSleepWorkflow | wrun_01KNAKA8PQD2SYGCAXWR9HVE51
  • nullByteWorkflow | wrun_01KNAKAD6QD6RJPGTS4WR8JHGT
  • workflowAndStepMetadataWorkflow | wrun_01KNAKAFGF21MRNF07PR4H6JE9
  • fetchWorkflow | wrun_01KNAKDC1ATQDKHJ6HEEKMYHVM
  • promiseRaceStressTestWorkflow | wrun_01KNAKDFF1NFYTND6QBF5MEFYP
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • error handling not registered WorkflowNotRegisteredError fails the run when workflow does not exist
  • error handling not registered StepNotRegisteredError fails the step but workflow can catch it
  • error handling not registered StepNotRegisteredError fails the run when not caught in workflow
  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KNAKGZM059S1SA4DDB983WNR
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KNAKHM6PTR4TKBA2R398XJGP
  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running | wrun_01KNAKJ92DS3VDBWN5AN6398JJ
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars) | wrun_01KNAKJWDZX4VEBN144QV8HTBR
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument | wrun_01KNAKK4VVHMCSN27C7HE97F62
  • closureVariableWorkflow - nested step functions with closure variables | wrun_01KNAKKA9JY2Q5NBCGYPN22ARF
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step | wrun_01KNAKKCG9CT100SAJZM6ZDP0J
  • health check (queue-based) - workflow and step endpoints respond to health check messages
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly | wrun_01KNAKKV8CGF67R69AYYXNHQTA
  • Calculator.calculate - static workflow method using static step methods from another class | wrun_01KNAKM17C3RECCRJ1PCJ3PQN1
  • AllInOneService.processNumber - static workflow method using sibling static step methods | wrun_01KNAKM7RC8T75GQT0T4WYE7NM
  • ChainableService.processWithThis - static step methods using this to reference the class | wrun_01KNAKME5JB19SFQQPEFVVFNE4
  • thisSerializationWorkflow - step function invoked with .call() and .apply() | wrun_01KNAKMN3ZWEEHJSV3371QKRYQ
  • customSerializationWorkflow - custom class serialization with WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE | wrun_01KNAKMVQFV4RX6WQ2ZBS87G2R
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KNAKN33PJSD7MAATP32VNEAY
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KNAKNETVQZ9JYM2PXMKG8DJF
  • stepFunctionAsStartArgWorkflow - step function reference passed as start() argument | wrun_01KNAKNQJTY80ZDC1ZKV449TSY
  • cancelRun - cancelling a running workflow | wrun_01KNAKNY2K9XPPSD08TSRHVFEM
  • cancelRun via CLI - cancelling a running workflow | wrun_01KNAKP70T0P8JXEATWX05K0PA
  • pages router addTenWorkflow via pages router
  • pages router promiseAllWorkflow via pages router
  • pages router sleepingWorkflow via pages router
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep | wrun_01KNAKPJEM1VFCK4RDKBG4WP2T
  • sleepInLoopWorkflow - sleep inside loop with steps actually delays each iteration | wrun_01KNAKQ6ZVSPP3KQB0QM43W685
  • sleepWithSequentialStepsWorkflow - sequential steps work with concurrent sleep (control) | wrun_01KNAKQJ22025W8JWVWA5YTG18
  • importMetaUrlWorkflow - import.meta.url is available in step bundles | wrun_01KNAKQRKDPQ4TEHS6DHDVNSKC
  • metadataFromHelperWorkflow - getWorkflowMetadata/getStepMetadata work from module-level helper (#1577) | wrun_01KNAKQV12S7NW7BRVXB5EXWMZ

Details by Category

❌ ▲ Vercel Production

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro | 78 | 0 | 7 |
| ✅ example | 78 | 0 | 7 |
| ✅ express | 78 | 0 | 7 |
| ✅ fastify | 78 | 0 | 7 |
| ✅ hono | 78 | 0 | 7 |
| ✅ nextjs-turbopack | 83 | 0 | 2 |
| ✅ nextjs-webpack | 83 | 0 | 2 |
| ✅ nitro | 78 | 0 | 7 |
| ✅ nuxt | 78 | 0 | 7 |
| ✅ sveltekit | 78 | 0 | 7 |
| ❌ vite | 77 | 1 | 7 |

✅ 💻 Local Development

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro-stable | 71 | 0 | 14 |
| ✅ express-stable | 71 | 0 | 14 |
| ✅ fastify-stable | 71 | 0 | 14 |
| ✅ hono-stable | 71 | 0 | 14 |
| ✅ nextjs-turbopack-canary | 60 | 0 | 25 |
| ✅ nextjs-turbopack-stable | 77 | 0 | 8 |
| ✅ nextjs-webpack-canary | 60 | 0 | 25 |
| ✅ nextjs-webpack-stable | 77 | 0 | 8 |
| ✅ nitro-stable | 71 | 0 | 14 |
| ✅ nuxt-stable | 71 | 0 | 14 |
| ✅ sveltekit-stable | 71 | 0 | 14 |
| ✅ vite-stable | 71 | 0 | 14 |

✅ 📦 Local Production

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro-stable | 71 | 0 | 14 |
| ✅ express-stable | 71 | 0 | 14 |
| ✅ fastify-stable | 71 | 0 | 14 |
| ✅ hono-stable | 71 | 0 | 14 |
| ✅ nextjs-turbopack-canary | 60 | 0 | 25 |
| ✅ nextjs-turbopack-stable | 77 | 0 | 8 |
| ✅ nextjs-webpack-canary | 60 | 0 | 25 |
| ✅ nextjs-webpack-stable | 77 | 0 | 8 |
| ✅ nitro-stable | 71 | 0 | 14 |
| ✅ nuxt-stable | 71 | 0 | 14 |
| ✅ sveltekit-stable | 71 | 0 | 14 |
| ✅ vite-stable | 71 | 0 | 14 |

✅ 🐘 Local Postgres

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro-stable | 71 | 0 | 14 |
| ✅ express-stable | 71 | 0 | 14 |
| ✅ fastify-stable | 71 | 0 | 14 |
| ✅ hono-stable | 71 | 0 | 14 |
| ✅ nextjs-turbopack-canary | 60 | 0 | 25 |
| ✅ nextjs-turbopack-stable | 77 | 0 | 8 |
| ✅ nextjs-webpack-canary | 60 | 0 | 25 |
| ✅ nextjs-webpack-stable | 77 | 0 | 8 |
| ✅ nitro-stable | 71 | 0 | 14 |
| ✅ nuxt-stable | 71 | 0 | 14 |
| ✅ sveltekit-stable | 71 | 0 | 14 |
| ✅ vite-stable | 71 | 0 | 14 |

✅ 🪟 Windows

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ nextjs-turbopack | 77 | 0 | 8 |

❌ 🌍 Community Worlds

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ mongodb-dev | 5 | 0 | 0 |
| ❌ mongodb | 58 | 2 | 8 |
| ✅ redis-dev | 5 | 0 | 0 |
| ❌ redis | 58 | 2 | 8 |
| ✅ turso-dev | 5 | 0 | 0 |
| ❌ turso | 4 | 56 | 8 |

✅ 📋 Other

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ e2e-local-dev-nest-stable | 71 | 0 | 14 |
| ✅ e2e-local-postgres-nest-stable | 71 | 0 | 14 |
| ✅ e2e-local-prod-nest-stable | 71 | 0 | 14 |

📋 View full workflow run


Some E2E test jobs failed:

  • Vercel Prod: failure
  • Local Dev: success
  • Local Prod: success
  • Local Postgres: success
  • Windows: success

Check the workflow run for details.

github-actions Bot commented Mar 18, 2026

📊 Benchmark Results

📈 Comparing against baseline from main branch. Green 🟢 = faster, Red 🔺 = slower.

workflow with no steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Express 0.035s (-17.9% 🟢) 1.005s (~) 0.969s 10 1.00x
💻 Local Nitro 0.043s (-17.6% 🟢) 1.005s (~) 0.963s 10 1.20x
💻 Local Next.js (Turbopack) 0.043s 1.005s 0.962s 10 1.21x
🌐 Redis Next.js (Turbopack) 0.055s 1.005s 0.950s 10 1.56x
🐘 Postgres Next.js (Turbopack) 0.058s 1.012s 0.954s 10 1.63x
🐘 Postgres Express 0.059s (-10.7% 🟢) 1.011s (~) 0.952s 10 1.69x
🐘 Postgres Nitro 0.062s (+7.3% 🔺) 1.011s (~) 0.949s 10 1.75x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 0.447s (-12.8% 🟢) 2.638s (-5.5% 🟢) 2.191s 10 1.00x
▲ Vercel Next.js (Turbopack) 0.456s (-28.9% 🟢) 2.168s (-11.2% 🟢) 1.711s 10 1.02x
▲ Vercel Express 0.620s (+25.2% 🔺) 2.582s (+1.9%) 1.962s 10 1.39x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

workflow with 1 step

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Next.js (Turbopack) 1.099s 2.006s 0.906s 10 1.00x
💻 Local Express 1.103s (-2.9%) 2.006s (~) 0.902s 10 1.00x
💻 Local Nitro 1.126s (-1.4%) 2.006s (~) 0.880s 10 1.02x
🌐 Redis Next.js (Turbopack) 1.132s 2.007s 0.875s 10 1.03x
🐘 Postgres Express 1.137s (-0.5%) 2.010s (~) 0.874s 10 1.03x
🐘 Postgres Nitro 1.144s (-0.6%) 2.012s (~) 0.868s 10 1.04x
🐘 Postgres Next.js (Turbopack) 1.157s 2.011s 0.855s 10 1.05x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.214s (+5.5% 🔺) 3.894s (~) 1.680s 10 1.00x
▲ Vercel Next.js (Turbopack) 2.226s (-1.8%) 3.946s (+10.1% 🔺) 1.719s 10 1.01x
▲ Vercel Express 2.313s (+9.2% 🔺) 4.289s (+11.0% 🔺) 1.976s 10 1.04x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

workflow with 10 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Express 10.636s (-2.6%) 11.022s (~) 0.386s 3 1.00x
💻 Local Next.js (Turbopack) 10.705s 11.026s 0.321s 3 1.01x
🌐 Redis Next.js (Turbopack) 10.772s 11.024s 0.252s 3 1.01x
🐘 Postgres Next.js (Turbopack) 10.853s 11.027s 0.174s 3 1.02x
🐘 Postgres Express 10.898s (~) 11.024s (~) 0.126s 3 1.02x
🐘 Postgres Nitro 10.911s (~) 11.024s (~) 0.113s 3 1.03x
💻 Local Nitro 10.912s (-0.9%) 11.025s (-2.9%) 0.113s 3 1.03x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 16.413s (-11.2% 🟢) 17.963s (-11.2% 🟢) 1.549s 2 1.00x
▲ Vercel Express 17.162s (~) 19.536s (+2.2%) 2.375s 2 1.05x
▲ Vercel Next.js (Turbopack) 18.085s (-1.5%) 19.595s (-2.5%) 1.509s 2 1.10x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

workflow with 25 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Express 14.188s (-5.1% 🟢) 15.028s (~) 0.839s 4 1.00x
💻 Local Next.js (Turbopack) 14.252s 15.030s 0.778s 4 1.00x
🌐 Redis Next.js (Turbopack) 14.319s 15.030s 0.711s 4 1.01x
🐘 Postgres Next.js (Turbopack) 14.480s 15.030s 0.550s 4 1.02x
🐘 Postgres Express 14.520s (-0.7%) 15.028s (~) 0.508s 4 1.02x
🐘 Postgres Nitro 14.621s (+0.8%) 15.023s (~) 0.402s 4 1.03x
💻 Local Nitro 14.906s (-2.0%) 15.029s (-6.3% 🟢) 0.123s 4 1.05x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 31.927s (+0.8%) 33.625s (+0.8%) 1.697s 2 1.00x
▲ Vercel Next.js (Turbopack) 32.316s (-4.2%) 33.738s (-4.6%) 1.421s 2 1.01x
▲ Vercel Nitro 33.452s (-4.3%) 35.203s (-4.9%) 1.751s 2 1.05x

🔍 Observability: Express | Next.js (Turbopack) | Nitro

workflow with 50 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Redis 🥇 Next.js (Turbopack) 13.562s 14.026s 0.464s 7 1.00x
🐘 Postgres Next.js (Turbopack) 13.820s 14.022s 0.203s 7 1.02x
🐘 Postgres Nitro 13.971s (~) 14.311s (-2.0%) 0.340s 7 1.03x
🐘 Postgres Express 14.057s (~) 14.600s (+1.0%) 0.543s 7 1.04x
💻 Local Express 14.776s (-11.4% 🟢) 15.027s (-11.8% 🟢) 0.251s 6 1.09x
💻 Local Next.js (Turbopack) 14.974s 15.196s 0.222s 6 1.10x
💻 Local Nitro 16.549s (-3.6%) 17.030s (-5.6% 🟢) 0.481s 6 1.22x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 54.217s (~) 56.280s (+1.5%) 2.063s 2 1.00x
▲ Vercel Nitro 54.619s (-3.3%) 56.276s (-3.9%) 1.657s 2 1.01x
▲ Vercel Express 55.344s (+5.7% 🔺) 57.632s (+6.6% 🔺) 2.287s 2 1.02x

🔍 Observability: Next.js (Turbopack) | Nitro | Express

Promise.all with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Next.js (Turbopack) 1.255s 2.010s 0.755s 15 1.00x
🐘 Postgres Express 1.263s (~) 2.011s (~) 0.749s 15 1.01x
🐘 Postgres Nitro 1.264s (~) 2.012s (~) 0.748s 15 1.01x
🌐 Redis Next.js (Turbopack) 1.286s 2.006s 0.720s 15 1.02x
💻 Local Express 1.470s (-4.6%) 2.006s (~) 0.535s 15 1.17x
💻 Local Next.js (Turbopack) 1.513s 2.005s 0.492s 15 1.21x
💻 Local Nitro 1.521s (-2.9%) 2.006s (~) 0.485s 15 1.21x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 2.670s (+13.0% 🔺) 4.749s (+17.5% 🔺) 2.079s 7 1.00x
▲ Vercel Next.js (Turbopack) 2.944s (+5.7% 🔺) 4.469s (+15.0% 🔺) 1.525s 7 1.10x
▲ Vercel Nitro 3.034s (+19.1% 🔺) 4.356s (+3.6%) 1.322s 7 1.14x

🔍 Observability: Express | Next.js (Turbopack) | Nitro

Promise.all with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 2.328s (~) 3.010s (~) 0.682s 10 1.00x
🐘 Postgres Express 2.337s (~) 3.010s (~) 0.673s 10 1.00x
🐘 Postgres Next.js (Turbopack) 2.376s 3.011s 0.636s 10 1.02x
🌐 Redis Next.js (Turbopack) 2.538s 3.008s 0.470s 10 1.09x
💻 Local Express 2.713s (-12.8% 🟢) 3.008s (-25.0% 🟢) 0.295s 10 1.17x
💻 Local Next.js (Turbopack) 2.772s 3.008s 0.236s 10 1.19x
💻 Local Nitro 3.073s (-1.5%) 3.565s (-8.2% 🟢) 0.492s 9 1.32x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.871s (+0.5%) 4.297s (-2.6%) 1.426s 7 1.00x
▲ Vercel Next.js (Turbopack) 3.265s (+5.0%) 4.531s (~) 1.266s 7 1.14x
▲ Vercel Express 3.730s (+21.9% 🔺) 5.696s (+20.0% 🔺) 1.966s 6 1.30x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

Promise.all with 50 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 3.454s (-0.5%) 4.009s (~) 0.555s 8 1.00x
🐘 Postgres Express 3.485s (~) 4.012s (~) 0.527s 8 1.01x
🐘 Postgres Next.js (Turbopack) 3.619s 4.012s 0.393s 8 1.05x
🌐 Redis Next.js (Turbopack) 4.143s 4.725s 0.582s 7 1.20x
💻 Local Next.js (Turbopack) 6.814s 7.515s 0.701s 4 1.97x
💻 Local Express 6.938s (-14.9% 🟢) 7.414s (-17.8% 🟢) 0.477s 5 2.01x
💻 Local Nitro 7.991s (-4.7%) 8.523s (-5.6% 🟢) 0.533s 4 2.31x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 3.029s (-21.2% 🟢) 5.112s (-7.5% 🟢) 2.082s 6 1.00x
▲ Vercel Nitro 3.192s (-6.9% 🟢) 4.820s (-7.2% 🟢) 1.627s 7 1.05x
▲ Vercel Next.js (Turbopack) 3.484s (-15.9% 🟢) 5.654s (-1.5%) 2.170s 6 1.15x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Promise.race with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Next.js (Turbopack) 1.226s 2.010s 0.784s 15 1.00x
🐘 Postgres Express 1.261s (~) 2.009s (~) 0.748s 15 1.03x
🐘 Postgres Nitro 1.273s (+1.2%) 2.009s (~) 0.737s 15 1.04x
🌐 Redis Next.js (Turbopack) 1.295s 2.006s 0.711s 15 1.06x
💻 Local Express 1.467s (-4.9%) 2.005s (~) 0.538s 15 1.20x
💻 Local Next.js (Turbopack) 1.520s 2.006s 0.486s 15 1.24x
💻 Local Nitro 1.539s (-1.7%) 2.006s (~) 0.467s 15 1.26x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 2.257s (+0.6%) 3.995s (+11.0% 🔺) 1.738s 8 1.00x
▲ Vercel Express 2.330s (-5.5% 🟢) 4.215s (+1.9%) 1.885s 8 1.03x
▲ Vercel Nitro 2.362s (-6.6% 🟢) 4.008s (-7.7% 🟢) 1.646s 8 1.05x

🔍 Observability: Next.js (Turbopack) | Express | Nitro

Promise.race with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 2.334s (~) 3.010s (~) 0.676s 10 1.00x
🐘 Postgres Nitro 2.342s (+0.8%) 3.012s (~) 0.670s 10 1.00x
🐘 Postgres Next.js (Turbopack) 2.405s 3.011s 0.606s 10 1.03x
🌐 Redis Next.js (Turbopack) 2.577s 3.008s 0.431s 10 1.10x
💻 Local Express 2.683s (-13.7% 🟢) 3.008s (-22.6% 🟢) 0.325s 10 1.15x
💻 Local Next.js (Turbopack) 2.794s 3.208s 0.413s 10 1.20x
💻 Local Nitro 3.143s (-2.2%) 3.885s (~) 0.743s 8 1.35x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 2.609s (+0.8%) 4.511s (+14.1% 🔺) 1.901s 7 1.00x
▲ Vercel Nitro 2.662s (-28.9% 🟢) 4.381s (-19.0% 🟢) 1.718s 8 1.02x
▲ Vercel Next.js (Turbopack) 2.733s (-6.2% 🟢) 4.334s (+8.5% 🔺) 1.601s 7 1.05x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Promise.race with 50 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 3.465s (~) 4.011s (~) 0.546s 8 1.00x
🐘 Postgres Nitro 3.486s (+0.8%) 4.010s (~) 0.524s 8 1.01x
🐘 Postgres Next.js (Turbopack) 3.657s 4.013s 0.356s 8 1.06x
🌐 Redis Next.js (Turbopack) 4.282s 4.869s 0.588s 7 1.24x
💻 Local Express 7.636s (-15.2% 🟢) 8.015s (-15.9% 🟢) 0.379s 4 2.20x
💻 Local Next.js (Turbopack) 8.174s 8.768s 0.594s 4 2.36x
💻 Local Nitro 10.163s (+13.6% 🔺) 10.691s (+15.3% 🔺) 0.528s 3 2.93x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 3.031s (-5.6% 🟢) 4.599s (-6.4% 🟢) 1.568s 7 1.00x
▲ Vercel Express 3.376s (+1.1%) 5.128s (+6.3% 🔺) 1.752s 7 1.11x
▲ Vercel Next.js (Turbopack) 3.780s (+2.8%) 5.371s (+4.7%) 1.591s 6 1.25x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

workflow with 10 sequential data payload steps (10KB)

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Express 0.685s (-31.4% 🟢) 1.004s (-29.5% 🟢) 0.319s 60 1.00x
💻 Local Next.js (Turbopack) 0.715s 1.004s 0.290s 60 1.04x
🌐 Redis Next.js (Turbopack) 0.736s 1.005s 0.269s 60 1.07x
🐘 Postgres Next.js (Turbopack) 0.775s 1.007s 0.232s 60 1.13x
🐘 Postgres Nitro 0.819s (-1.0%) 1.007s (~) 0.188s 60 1.20x
🐘 Postgres Express 0.836s (-1.8%) 1.007s (~) 0.171s 60 1.22x
💻 Local Nitro 0.994s (-3.1%) 1.309s (-31.6% 🟢) 0.315s 46 1.45x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 8.746s (-9.7% 🟢) 10.014s (-11.7% 🟢) 1.268s 6 1.00x
▲ Vercel Express 9.136s (-6.9% 🟢) 10.746s (-10.9% 🟢) 1.611s 6 1.04x
▲ Vercel Next.js (Turbopack) 9.729s (-5.1% 🟢) 11.171s (-5.8% 🟢) 1.442s 6 1.11x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

workflow with 25 sequential data payload steps (10KB)

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Redis 🥇 Next.js (Turbopack) 1.766s 2.006s 0.240s 45 1.00x
🐘 Postgres Next.js (Turbopack) 1.947s 2.176s 0.229s 42 1.10x
🐘 Postgres Nitro 1.964s (~) 2.203s (+2.4%) 0.239s 41 1.11x
🐘 Postgres Express 1.966s (-2.8%) 2.228s (-17.0% 🟢) 0.261s 41 1.11x
💻 Local Express 2.240s (-25.7% 🟢) 3.007s (-16.7% 🟢) 0.767s 30 1.27x
💻 Local Next.js (Turbopack) 2.313s 3.008s 0.695s 30 1.31x
💻 Local Nitro 2.950s (-4.5%) 3.146s (-21.6% 🟢) 0.195s 29 1.67x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 26.688s (-11.5% 🟢) 28.849s (-9.8% 🟢) 2.161s 4 1.00x
▲ Vercel Nitro 27.719s (-8.3% 🟢) 29.354s (-8.1% 🟢) 1.635s 4 1.04x
▲ Vercel Next.js (Turbopack) 28.510s (-5.2% 🟢) 29.923s (-5.5% 🟢) 1.413s 4 1.07x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

workflow with 50 sequential data payload steps (10KB)

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Redis 🥇 Next.js (Turbopack) 3.670s 4.076s 0.406s 30 1.00x
🐘 Postgres Next.js (Turbopack) 3.946s 4.185s 0.239s 29 1.08x
🐘 Postgres Nitro 3.961s (-1.5%) 4.183s (-7.7% 🟢) 0.222s 29 1.08x
🐘 Postgres Express 4.020s (-0.9%) 4.604s (-3.7%) 0.584s 27 1.10x
💻 Local Express 7.308s (-20.1% 🟢) 8.014s (-16.8% 🟢) 0.706s 15 1.99x
💻 Local Next.js (Turbopack) 7.404s 8.015s 0.611s 15 2.02x
💻 Local Nitro 9.048s (-3.9%) 9.402s (-6.2% 🟢) 0.354s 13 2.47x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 71.529s (-14.5% 🟢) 72.724s (-15.4% 🟢) 1.195s 2 1.00x
▲ Vercel Nitro 71.683s (-10.8% 🟢) 74.030s (-10.0% 🟢) 2.347s 2 1.00x
▲ Vercel Express 73.800s (-11.6% 🟢) 75.765s (-11.7% 🟢) 1.964s 2 1.03x

🔍 Observability: Next.js (Turbopack) | Nitro | Express

workflow with 10 concurrent data payload steps (10KB)

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Next.js (Turbopack) 0.259s 1.008s 0.749s 60 1.00x
🐘 Postgres Express 0.280s (+0.8%) 1.008s (~) 0.728s 60 1.08x
🐘 Postgres Nitro 0.289s (+6.0% 🔺) 1.008s (~) 0.719s 60 1.11x
🌐 Redis Next.js (Turbopack) 0.314s 1.005s 0.691s 60 1.21x
💻 Local Express 0.559s (-3.8%) 1.004s (~) 0.445s 60 2.16x
💻 Local Nitro 0.571s (-16.4% 🟢) 1.004s (-3.4%) 0.434s 60 2.20x
💻 Local Next.js (Turbopack) 0.616s 1.021s 0.406s 59 2.37x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 1.533s (-13.1% 🟢) 3.268s (-12.9% 🟢) 1.735s 19 1.00x
▲ Vercel Next.js (Turbopack) 1.822s (-8.6% 🟢) 3.762s (-1.8%) 1.939s 16 1.19x
▲ Vercel Express 1.939s (+22.9% 🔺) 3.589s (+15.7% 🔺) 1.650s 17 1.26x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

workflow with 25 concurrent data payload steps (10KB)

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 0.487s (~) 1.007s (~) 0.520s 90 1.00x
🐘 Postgres Express 0.496s (~) 1.008s (~) 0.512s 90 1.02x
🐘 Postgres Next.js (Turbopack) 0.511s 1.019s 0.509s 89 1.05x
🌐 Redis Next.js (Turbopack) 1.254s 2.007s 0.753s 45 2.57x
💻 Local Express 2.397s (-6.1% 🟢) 3.007s (~) 0.610s 30 4.92x
💻 Local Nitro 2.489s (-1.6%) 3.009s (~) 0.520s 30 5.11x
💻 Local Next.js (Turbopack) 2.628s 3.008s 0.380s 30 5.39x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 3.033s (-9.6% 🟢) 4.734s (-6.5% 🟢) 1.702s 20 1.00x
▲ Vercel Express 3.440s (-3.0%) 5.208s (-2.4%) 1.768s 18 1.13x
▲ Vercel Next.js (Turbopack) 3.831s (-15.6% 🟢) 5.328s (-12.8% 🟢) 1.497s 17 1.26x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

workflow with 50 concurrent data payload steps (10KB)

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Next.js (Turbopack) 0.761s 1.008s 0.246s 120 1.00x
🐘 Postgres Nitro 0.773s (~) 1.008s (~) 0.235s 120 1.02x
🐘 Postgres Express 0.785s (-0.8%) 1.008s (~) 0.223s 120 1.03x
🌐 Redis Next.js (Turbopack) 3.058s 3.539s 0.482s 34 4.02x
💻 Local Next.js (Turbopack) 10.309s 10.773s 0.464s 12 13.54x
💻 Local Express 10.536s (-6.0% 🟢) 11.025s (-7.6% 🟢) 0.489s 11 13.84x
💻 Local Nitro 11.021s (-2.9%) 11.483s (-3.8%) 0.462s 11 14.48x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 6.713s (-18.7% 🟢) 8.692s (-11.2% 🟢) 1.979s 14 1.00x
▲ Vercel Nitro 7.276s (-20.7% 🟢) 9.088s (-16.2% 🟢) 1.812s 14 1.08x
▲ Vercel Next.js (Turbopack) 44.157s (+433.3% 🔺) 45.860s (+371.0% 🔺) 1.702s 6 6.58x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Stream Benchmarks (includes TTFB metrics)
workflow with stream

💻 Local Development

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Express 0.144s (-29.2% 🟢) 1.003s (~) 0.009s (-23.3% 🟢) 1.014s (~) 0.871s 10 1.00x
💻 Local Next.js (Turbopack) 0.167s 1.002s 0.010s 1.016s 0.850s 10 1.16x
🌐 Redis Next.js (Turbopack) 0.181s 1.000s 0.002s 1.008s 0.827s 10 1.26x
🐘 Postgres Next.js (Turbopack) 0.193s 1.000s 0.001s 1.011s 0.818s 10 1.34x
💻 Local Nitro 0.202s (-7.4% 🟢) 1.003s (~) 0.011s (+0.9%) 1.017s (~) 0.816s 10 1.40x
🐘 Postgres Nitro 0.202s (-3.2%) 0.999s (~) 0.002s (+13.3% 🔺) 1.011s (~) 0.809s 10 1.41x
🐘 Postgres Express 0.205s (~) 0.997s (~) 0.001s (-14.3% 🟢) 1.010s (~) 0.805s 10 1.43x

▲ Production (Vercel)

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 1.592s (-20.5% 🟢) 3.114s (+4.5%) 0.314s (-53.4% 🟢) 3.971s (-5.4% 🟢) 2.378s 10 1.00x
▲ Vercel Express 1.883s (+2.4%) 3.227s (+9.1% 🔺) 0.417s (-29.1% 🟢) 4.260s (~) 2.377s 10 1.18x
▲ Vercel Nitro 2.231s (+25.3% 🔺) 3.205s (+9.9% 🔺) 0.495s (-7.4% 🟢) 4.258s (+3.4%) 2.026s 10 1.40x

🔍 Observability: Next.js (Turbopack) | Express | Nitro

stream pipeline with 5 transform steps (1MB)

💻 Local Development

| World | Framework | Workflow Time | TTFB | Slurp | Wall Time | Overhead | Samples | vs Fastest |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 🌐 Redis | 🥇 Next.js (Turbopack) | 0.571s | 1.001s | 0.003s | 1.012s | 0.442s | 60 | 1.00x |
| 💻 Local | Express | 0.582s (-19.6% 🟢) | 1.009s (~) | 0.009s (+5.7% 🔺) | 1.022s (~) | 0.441s | 59 | 1.02x |
| 🐘 Postgres | Next.js (Turbopack) | 0.586s | 1.008s | 0.005s | 1.023s | 0.437s | 59 | 1.03x |
| 💻 Local | Next.js (Turbopack) | 0.597s | 1.009s | 0.010s | 1.024s | 0.427s | 59 | 1.05x |
| 🐘 Postgres | Nitro | 0.605s (~) | 1.003s (~) | 0.004s (-5.5% 🟢) | 1.023s (~) | 0.418s | 59 | 1.06x |
| 🐘 Postgres | Express | 0.617s (~) | 1.004s (~) | 0.004s (-1.7%) | 1.023s (~) | 0.406s | 59 | 1.08x |
| 💻 Local | Nitro | 0.728s (-4.2%) | 1.010s (~) | 0.009s (-9.2% 🟢) | 1.022s (~) | 0.294s | 59 | 1.28x |

▲ Production (Vercel)

| World | Framework | Workflow Time | TTFB | Slurp | Wall Time | Overhead | Samples | vs Fastest |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ▲ Vercel | 🥇 Nitro | 4.097s (-22.9% 🟢) | 5.469s (-17.3% 🟢) | 0.198s (-66.0% 🟢) | 6.185s (-21.3% 🟢) | 2.088s | 10 | 1.00x |
| ▲ Vercel | Express | 4.215s (-2.3%) | 5.443s (~) | 0.744s (+221.9% 🔺) | 6.905s (+6.9% 🔺) | 2.690s | 9 | 1.03x |
| ▲ Vercel | Next.js (Turbopack) | 4.419s (-24.8% 🟢) | 5.442s (-21.5% 🟢) | 0.185s (-56.5% 🟢) | 6.556s (-17.3% 🟢) | 2.137s | 10 | 1.08x |

🔍 Observability: Nitro | Express | Next.js (Turbopack)

10 parallel streams (1MB each)

💻 Local Development

| World | Framework | Workflow Time | TTFB | Slurp | Wall Time | Overhead | Samples | vs Fastest |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 🐘 Postgres | 🥇 Next.js (Turbopack) | 0.897s | 1.091s | 0.000s | 1.099s | 0.202s | 55 | 1.00x |
| 🐘 Postgres | Nitro | 0.940s (-3.6%) | 1.169s (~) | 0.000s (-49.0% 🟢) | 1.185s (~) | 0.245s | 51 | 1.05x |
| 🐘 Postgres | Express | 0.985s (+0.8%) | 1.210s (+1.0%) | 0.000s (+Infinity% 🔺) | 1.229s (-0.6%) | 0.244s | 51 | 1.10x |
| 🌐 Redis | Next.js (Turbopack) | 0.993s | 1.464s | 0.000s | 1.469s | 0.476s | 41 | 1.11x |
| 💻 Local | Express | 1.174s (-5.9% 🟢) | 2.017s (~) | 0.000s (+12.5% 🔺) | 2.020s (~) | 0.847s | 30 | 1.31x |
| 💻 Local | Nitro | 1.261s (-2.7%) | 2.021s (~) | 0.000s (-27.3% 🟢) | 2.024s (~) | 0.763s | 30 | 1.40x |
| 💻 Local | Next.js (Turbopack) | 1.296s | 2.020s | 0.000s | 2.024s | 0.728s | 30 | 1.44x |

▲ Production (Vercel)

| World | Framework | Workflow Time | TTFB | Slurp | Wall Time | Overhead | Samples | vs Fastest |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ▲ Vercel | 🥇 Nitro | 3.001s (+4.0%) | 3.911s (-1.4%) | 0.000s (+39.3% 🔺) | 4.475s (-3.2%) | 1.474s | 14 | 1.00x |
| ▲ Vercel | Express | 3.034s (+1.0%) | 4.111s (~) | 0.000s (-100.0% 🟢) | 4.806s (~) | 1.772s | 14 | 1.01x |
| ▲ Vercel | Next.js (Turbopack) | 41.299s (+1166.7% 🔺) | 42.542s (+931.5% 🔺) | 0.000s (NaN%) | 43.102s (+820.1% 🔺) | 1.804s | 8 | 13.76x |

🔍 Observability: Nitro | Express | Next.js (Turbopack)

fan-out fan-in 10 streams (1MB each)

💻 Local Development

| World | Framework | Workflow Time | TTFB | Slurp | Wall Time | Overhead | Samples | vs Fastest |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 🐘 Postgres | 🥇 Nitro | 1.724s (+0.6%) | 2.100s (~) | 0.000s (~) | 2.122s (~) | 0.397s | 29 | 1.00x |
| 🐘 Postgres | Express | 1.762s (+0.7%) | 2.064s (~) | 0.000s (NaN%) | 2.108s (~) | 0.346s | 29 | 1.02x |
| 🐘 Postgres | Next.js (Turbopack) | 1.819s | 2.181s | 0.000s | 2.189s | 0.369s | 28 | 1.06x |
| 🌐 Redis | Next.js (Turbopack) | 1.837s | 2.219s | 0.000s | 2.230s | 0.392s | 27 | 1.07x |
| 💻 Local | Express | 3.367s (-7.0% 🟢) | 4.032s (-1.6%) | 0.000s (~) | 4.035s (-1.6%) | 0.668s | 15 | 1.95x |
| 💻 Local | Nitro | 3.522s (-5.7% 🟢) | 4.166s (-1.6%) | 0.001s (-27.3% 🟢) | 4.170s (-1.7%) | 0.648s | 15 | 2.04x |
| 💻 Local | Next.js (Turbopack) | 3.753s | 4.098s | 0.001s | 4.103s | 0.351s | 15 | 2.18x |

▲ Production (Vercel)

| World | Framework | Workflow Time | TTFB | Slurp | Wall Time | Overhead | Samples | vs Fastest |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ▲ Vercel | 🥇 Express | 3.663s (-2.0%) | 5.427s (+7.9% 🔺) | 0.000s (-72.5% 🟢) | 6.053s (+7.1% 🔺) | 2.390s | 10 | 1.00x |
| ▲ Vercel | Nitro | 3.898s (-6.6% 🟢) | 5.071s (-5.7% 🟢) | 0.000s (NaN%) | 5.585s (-7.4% 🟢) | 1.687s | 11 | 1.06x |
| ▲ Vercel | Next.js (Turbopack) | 4.885s (-7.3% 🟢) | 6.202s (~) | 0.005s (+Infinity% 🔺) | 6.794s (~) | 1.909s | 9 | 1.33x |

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Summary

Fastest Framework by World

Winner determined by most benchmark wins

| World | 🥇 Fastest Framework | Wins |
| --- | --- | --- |
| 💻 Local | Express | 18/21 |
| 🐘 Postgres | Next.js (Turbopack) | 14/21 |
| ▲ Vercel | Nitro | 10/21 |

Fastest World by Framework

Winner determined by most benchmark wins

| Framework | 🥇 Fastest World | Wins |
| --- | --- | --- |
| Express | 🐘 Postgres | 12/21 |
| Next.js (Turbopack) | 🐘 Postgres | 10/21 |
| Nitro | 🐘 Postgres | 16/21 |

Column Definitions
  • Workflow Time: Runtime reported by workflow (completedAt - createdAt) - primary metric
  • TTFB: Time to First Byte - time from workflow start until first stream byte received (stream benchmarks only)
  • Slurp: Time from first byte to complete stream consumption (stream benchmarks only)
  • Wall Time: Total testbench time (trigger workflow + poll for result)
  • Overhead: Testbench overhead (Wall Time - Workflow Time)
  • Samples: Number of benchmark iterations run
  • vs Fastest: How much slower compared to the fastest configuration for this benchmark

Worlds:

  • 💻 Local: In-memory filesystem world (local development)
  • 🐘 Postgres: PostgreSQL database world (local development)
  • ▲ Vercel: Vercel production/preview deployment
  • 🌐 Turso: Community world (local development)
  • 🌐 MongoDB: Community world (local development)
  • 🌐 Redis: Community world (local development)
  • 🌐 Jazz: Community world (local development)

📋 View full workflow run

Include reasoning parts from step results in the assistant message
alongside tool-call parts when building the conversation prompt for
the next tool loop iteration. This mirrors what the AI SDK's
toResponseMessages() does, ensuring reasoning models retain access
to their prior reasoning across multi-step tool loops.
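
The message construction described above can be sketched as follows. The part shapes and the `buildAssistantContent` helper are simplified illustrations, not the actual stream-text-iterator.ts code, which uses the AI SDK's LanguageModelV3 types:

```typescript
// Simplified part shapes (assumptions, not the AI SDK's real types).
type ReasoningPart = {
  type: 'reasoning';
  text: string;
  providerOptions?: Record<string, unknown>;
};

type ToolCallPart = {
  type: 'tool-call';
  toolCallId: string;
  toolName: string;
  input: unknown;
};

type AssistantContent = Array<ReasoningPart | ToolCallPart>;

function buildAssistantContent(
  reasoning: Array<{ text: string; providerMetadata?: Record<string, unknown> }>,
  toolCalls: ToolCallPart[],
): AssistantContent {
  return [
    // Reasoning parts precede the tool calls so that providers like the
    // OpenAI Responses API can resolve itemId references between them.
    ...reasoning.map(
      (part): ReasoningPart => ({
        type: 'reasoning',
        text: part.text,
        ...(part.providerMetadata != null
          ? { providerOptions: part.providerMetadata }
          : {}),
      }),
    ),
    ...toolCalls,
  ];
}

const content = buildAssistantContent(
  [
    {
      text: 'Need the weather tool.',
      providerMetadata: { openai: { itemId: 'rs_123' } },
    },
  ],
  [
    {
      type: 'tool-call',
      toolCallId: 'call_1',
      toolName: 'getWeather',
      input: { city: 'San Francisco' },
    },
  ],
);
console.log(content.map((part) => part.type)); // [ 'reasoning', 'tool-call' ]
```

This ordering is what makes the removal of `sanitizeProviderMetadataForToolCall()` safe: the reasoning item a tool call refers to is now present in the same assistant message.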

Remove sanitizeProviderMetadataForToolCall() — with reasoning items
now preserved in the conversation, OpenAI's itemId references are
valid and no longer need to be stripped.

Closes #1393

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
chunksToStep() previously only collected reasoning-delta chunks. For
models with encrypted reasoning (like o4-mini), there are no deltas —
only reasoning-start + reasoning-end chunks carrying the itemId needed
for OpenAI Responses API item references.

Now aggregates reasoning by ID from reasoning-start chunks, appending
delta text when available. This ensures providerMetadata (including
itemId and reasoningEncryptedContent) flows through even when no
reasoning deltas are emitted.
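
A minimal sketch of that aggregation, with assumed chunk shapes (the real `chunksToStep()` operates on the AI SDK's stream part types):

```typescript
// Assumed chunk shapes for illustration only.
type ReasoningChunk =
  | { type: 'reasoning-start'; id: string; providerMetadata?: Record<string, unknown> }
  | { type: 'reasoning-delta'; id: string; delta: string };

type ReasoningPart = {
  type: 'reasoning';
  text: string;
  providerMetadata?: Record<string, unknown>;
};

function aggregateReasoning(chunks: ReasoningChunk[]): ReasoningPart[] {
  const byId = new Map<string, ReasoningPart>();
  for (const chunk of chunks) {
    if (chunk.type === 'reasoning-start') {
      // Create the entry eagerly: encrypted reasoning (OpenAI o-series)
      // emits no deltas, only start/end chunks carrying the itemId.
      byId.set(chunk.id, {
        type: 'reasoning',
        text: '',
        ...(chunk.providerMetadata != null
          ? { providerMetadata: chunk.providerMetadata }
          : {}),
      });
    } else if (chunk.type === 'reasoning-delta') {
      // Append delta text when the model streams readable reasoning.
      const entry = byId.get(chunk.id);
      if (entry != null) entry.text += chunk.delta;
    }
  }
  return [...byId.values()];
}

// Encrypted reasoning: a start chunk with metadata and no deltas still
// yields a part whose providerMetadata survives for item references.
const parts = aggregateReasoning([
  { type: 'reasoning-start', id: 'rs_1', providerMetadata: { openai: { itemId: 'rs_1' } } },
]);

// Readable reasoning: deltas accumulate onto the entry created at start.
const streamed = aggregateReasoning([
  { type: 'reasoning-start', id: 'rs_2' },
  { type: 'reasoning-delta', id: 'rs_2', delta: 'thinking ' },
  { type: 'reasoning-delta', id: 'rs_2', delta: 'aloud' },
]);
// streamed[0].text === 'thinking aloud'
```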

Verified with o4-mini via OpenAI Responses API:
- With fix: follow-up request accepted
- Without fix: "Item 'fc_...' of type 'function_call' was provided
  without its required 'reasoning' item: 'rs_...'" (#880)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@pranaygp pranaygp left a comment

Solid PR — the approach of mirroring AI SDK's toResponseMessages() is the right call, and the removal of sanitizeProviderMetadataForToolCall follows cleanly from reasoning now being preserved. Tests are thorough (unit + e2e with mock reasoning model). Left a couple of inline comments.

providerMetadata: chunk.providerMetadata,
});
}
}
Contributor

The reasoning-end chunks can also carry providerMetadata (visible in the stream transform at lines 345-347 of this same file), but this aggregation loop only processes reasoning-start and reasoning-delta. If a provider attaches metadata exclusively to reasoning-end (rather than reasoning-start), it would be silently dropped.

Probably fine for current providers (OpenAI puts it on reasoning-start), but worth a comment or a defensive reasoning-end handler that merges metadata into the entry.

Contributor Author

In the AI SDK we throw an error when reasoning data is only attached to reasoning-end

https://github.com/vercel/ai/blob/438708ae26b2340c9703c480d179c3b13c01d1af/packages/ai/src/ui/process-ui-message-stream.ts#L449

but we are still capturing it; I updated the PR

...(meta != null ? { providerOptions: meta } : {}),
};
}),
] as typeof toolCalls,
Contributor

Nit: this as typeof toolCalls cast was already here before this PR, but now it's even less accurate — the array genuinely contains reasoning parts that don't conform to LanguageModelV3ToolCall. If the prompt type's content field supports a union of reasoning + tool-call parts, it'd be cleaner to type it properly. Not a blocker though.

- Add reasoning-end handler in chunksToStep() to merge providerMetadata,
  mirroring the AI SDK's behavior where reasoning-end can carry final
  metadata that should override earlier values.
- Replace inaccurate `as typeof toolCalls` cast with
  `as LanguageModelV3Prompt[number]['content']` since the content array
  now contains both reasoning and tool-call parts.
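
A sketch of that reasoning-end merge under assumed shapes; the entry map and the `mergeReasoningEndMetadata` helper are illustrative, not the actual implementation:

```typescript
type ProviderMetadata = Record<string, Record<string, unknown>>;
type ReasoningEntry = { text: string; providerMetadata?: ProviderMetadata };

function mergeReasoningEndMetadata(
  entries: Map<string, ReasoningEntry>,
  chunk: { id: string; providerMetadata?: ProviderMetadata },
): void {
  if (chunk.providerMetadata == null) return;
  const entry = entries.get(chunk.id);
  if (entry == null) return; // no matching reasoning-start was seen
  // Merge per provider so late values (e.g. reasoningEncryptedContent)
  // override earlier ones without discarding keys set at reasoning-start.
  const merged: ProviderMetadata = { ...entry.providerMetadata };
  for (const [provider, values] of Object.entries(chunk.providerMetadata)) {
    merged[provider] = { ...merged[provider], ...values };
  }
  entry.providerMetadata = merged;
}

const entries = new Map<string, ReasoningEntry>([
  ['rs_1', { text: '', providerMetadata: { openai: { itemId: 'rs_1' } } }],
]);
mergeReasoningEndMetadata(entries, {
  id: 'rs_1',
  providerMetadata: { openai: { reasoningEncryptedContent: 'gAAAA' } },
});
// entries.get('rs_1') now carries both itemId and reasoningEncryptedContent.
```

The per-provider spread matters here: a naive top-level spread would replace the whole `openai` object from reasoning-start with the one from reasoning-end, dropping the itemId.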

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines +334 to +337
] as Extract<
LanguageModelV3Prompt[number],
{ role: 'assistant' }
>['content'],
Contributor Author

Not sure I like this type; we could also revert to `as typeof toolCalls` with the comment

Suggested change
] as Extract<
LanguageModelV3Prompt[number],
{ role: 'assistant' }
>['content'],
// Cast: content is a mix of reasoning + tool-call parts, both valid
// in an assistant message. typeof toolCalls is imprecise but harmless.
] as typeof toolCalls,

@pranaygp
Contributor

@gr2m can we add any testing for this? esp. adding to or updating the e2e tests, or the durable chat demo inside the Next.js Turbopack workbench :)

@gr2m
Contributor Author

gr2m commented Mar 19, 2026

can we add any testing

like this? 5244397. I reverted it because I wasn't sure if you wanted it. Shall I put it back in, or would you prefer something else?

@VaguelySerious
Member

@gr2m The test you briefly added looks good to me

@craze3
Contributor

craze3 commented Mar 29, 2026

I wanted to confirm this PR looks directly relevant to a failure mode I reproduced locally.

On a long DurableAgent run with openai/gpt-5.4 and high reasoning, I saw:

  • repeated finishReason: 'tool-calls'
  • later finishReason: 'error'
  • Error processing UI message chunks: Error: [object Object]

From local inspection, the currently published package still:

  • strips OpenAI itemId metadata because reasoning items are not preserved
  • stringifies some structured errors down to [object Object]

So this PR appears to address the real underlying cause I’m seeing, not just an edge case. I ended up disabling provider-level reasoning locally as a mitigation, but I’d much rather rely on the reasoning-preservation fix here.

One separate issue still worth tracking even after this lands:

  • fatal stream errors should not collapse to [object Object]
  • consumers need a clearly terminal failure signal, not an ambiguous reconnectable state

This was referenced Apr 6, 2026

Successfully merging this pull request may close these issues.

DurableAgent should preserve reasoning content in conversation history across tool loop steps
