Backport: fix false-positive unconsumed event in hook loop replay #1780

Closed
TooTallNate wants to merge 2 commits into stable from backport-stable-fix-unconsumed-event-hook-loop

Conversation

@TooTallNate
Member

@TooTallNate TooTallNate commented Apr 16, 2026

NOTE: Do not merge; this is for v0 testing. Instead, we should merge #1778 and then use the backport label.

Summary

  • backport the EventsConsumer fix that re-checks the latest promise queue before reporting an unconsumed event during replay
  • backport the deterministic runWorkflow regression test for hook loops that use for await, where the next payload's hydration is appended after a prior step resolves
  • include the matching patch changeset for @workflow/core

Testing

  • pnpm vitest run src/events-consumer.test.ts
  • pnpm vitest run src/workflow.test.ts
  • pnpm vitest run src/hook-sleep-interaction.test.ts

@TooTallNate TooTallNate requested a review from a team as a code owner April 16, 2026 19:09
Copilot AI review requested due to automatic review settings April 16, 2026 19:09
@changeset-bot

changeset-bot Bot commented Apr 16, 2026

🦋 Changeset detected

Latest commit: f2f9dbc

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 18 packages
Name Type
@workflow/core Patch
@workflow/world-vercel Patch
@workflow/builders Patch
@workflow/cli Patch
@workflow/next Patch
@workflow/nitro Patch
@workflow/vitest Patch
@workflow/web-shared Patch
@workflow/web Patch
workflow Patch
@workflow/world-testing Patch
@workflow/astro Patch
@workflow/nest Patch
@workflow/rollup Patch
@workflow/sveltekit Patch
@workflow/vite Patch
@workflow/nuxt Patch
@workflow/ai Patch

@vercel
Contributor

vercel Bot commented Apr 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| example-nextjs-workflow-turbopack | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| example-nextjs-workflow-webpack | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| example-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-astro-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-express-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-fastify-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-hono-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-nitro-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-nuxt-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-sveltekit-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workbench-vite-workflow | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workflow-docs | Ready | Preview, Comment, Open in v0 | Apr 16, 2026 9:42pm |
| workflow-swc-playground | Ready | Preview, Comment | Apr 16, 2026 9:42pm |
| workflow-web | Ready | Preview, Comment | Apr 16, 2026 9:42pm |

@github-actions
Contributor

github-actions Bot commented Apr 16, 2026

🧪 E2E Test Results

Some tests failed

Summary

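| Category | Passed | Failed | Skipped | Total |
| --- | --- | --- | --- | --- |
| ❌ ▲ Vercel Production | 891 | 10 | 67 | 968 |
| ✅ 💻 Local Development | 970 | 0 | 86 | 1056 |
| ❌ 📦 Local Production | 969 | 1 | 86 | 1056 |
| ✅ 🐘 Local Postgres | 970 | 0 | 86 | 1056 |
| ✅ 🪟 Windows | 88 | 0 | 0 | 88 |
| ❌ 🌍 Community Worlds | 16 | 68 | 0 | 84 |
| ✅ 📋 Other | 246 | 0 | 18 | 264 |
| Total | 4150 | 79 | 343 | 4572 |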

❌ Failed Tests

▲ Vercel Production (10 failed)

astro (2 failed):

  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KPC43YWA7MRYYF6Z9RT50HQA | 🔍 observability
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KPC4864PWSVBW7S8PBH8RC68 | 🔍 observability

example (2 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KPC3VH9YHXC73J0Z8KJDAF2V | 🔍 observability
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep | wrun_01KPC49PRA4SGNCXMXPVBTJXE4 | 🔍 observability

express (2 failed):

  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running | wrun_01KPC459BZQJW5XF11JBTYYEMR | 🔍 observability
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KPC4864PWSVBW7S8PBH8RC68 | 🔍 observability

nitro (2 failed):

  • error handling error propagation step errors basic step error preserves message and stack trace
  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KPC43YWA7MRYYF6Z9RT50HQA | 🔍 observability

nuxt (1 failed):

  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KPC48HKKKSGX2TB8KEK052EX | 🔍 observability

sveltekit (1 failed):

  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KPC4864PWSVBW7S8PBH8RC68 | 🔍 observability
📦 Local Production (1 failed)

vite-stable (1 failed):

  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KPC43YWA7MRYYF6Z9RT50HQA
🌍 Community Worlds (68 failed)

mongodb-dev (1 failed):

  • dev e2e should rebuild on imported step dependency change

redis-dev (1 failed):

  • dev e2e should rebuild on imported step dependency change

turso-dev (1 failed):

  • dev e2e should rebuild on imported step dependency change

turso (65 failed):

  • addTenWorkflow | wrun_01KPC3T832C7N52NKSVE5VRFJ2
  • addTenWorkflow | wrun_01KPC3T832C7N52NKSVE5VRFJ2
  • wellKnownAgentWorkflow (.well-known/agent) | wrun_01KPC3SW33FK48M0KB70PZD031
  • should work with react rendering in step
  • promiseAllWorkflow | wrun_01KPC3TF153Q6FBCFWSWM1CHR9
  • promiseRaceWorkflow | wrun_01KPC3TMMV2YZWSEPYXR14962H
  • promiseAnyWorkflow | wrun_01KPC3TPSEAFR7HEGG6NT3N932
  • importedStepOnlyWorkflow | wrun_01KPC3T9K3SGP48ZM7Z0X30K1P
  • readableStreamWorkflow | wrun_01KPC3TT2FWJKF9X2TGRKN80EM
  • hookWorkflow | wrun_01KPC3V4VRD5W2JNB20Y07XMFC
  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KPC3VH9YHXC73J0Z8KJDAF2V
  • webhookWorkflow | wrun_01KPC3VT6XV0AH0KP8HKDR8S4V
  • sleepingWorkflow | wrun_01KPC3WZ1GX2BJDZPE135PB82K
  • parallelSleepWorkflow | wrun_01KPC3XBBN1XF6TWAF7KVA8E1R
  • nullByteWorkflow | wrun_01KPC3XER52FRY4MP8S9APHCZC
  • workflowAndStepMetadataWorkflow | wrun_01KPC3XGWB9J2KRNPMY689CECS
  • outputStreamWorkflow no startIndex (reads all chunks)
  • outputStreamWorkflow positive startIndex (skips first chunk)
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions | wrun_01KPC3ZSR4Y4T65Z5FCN5MB5Q3
  • fetchWorkflow | wrun_01KPC4079DAY1W6FMFKHTK81C0
  • promiseRaceStressTestWorkflow | wrun_01KPC40ANP7ZQS2K7Q4NSMX53F
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • error handling not registered WorkflowNotRegisteredError fails the run when workflow does not exist
  • error handling not registered StepNotRegisteredError fails the step but workflow can catch it
  • error handling not registered StepNotRegisteredError fails the run when not caught in workflow
  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KPC43YWA7MRYYF6Z9RT50HQA
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KPC44KQWF9CV77E7MJ31VNXN
  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running | wrun_01KPC459BZQJW5XF11JBTYYEMR
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars) | wrun_01KPC45XSRC77PWMHD5SBQJ54A
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument | wrun_01KPC466XV589J3G4RHBFEWSN6
  • closureVariableWorkflow - nested step functions with closure variables | wrun_01KPC46CER4TP5NN9F6K92X38M
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step | wrun_01KPC46EQAR10T1ZVD6Y1QDCDW
  • health check (queue-based) - workflow and step endpoints respond to health check messages
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly | wrun_01KPC46YCYQMF90W9XAWHGPH33
  • Calculator.calculate - static workflow method using static step methods from another class | wrun_01KPC4746J1SG9PQ93CDBQR05A
  • AllInOneService.processNumber - static workflow method using sibling static step methods | wrun_01KPC47AX1YESSKXQCKAPA160B
  • ChainableService.processWithThis - static step methods using this to reference the class | wrun_01KPC47HTS9YD7D05XJPAWQQJR
  • thisSerializationWorkflow - step function invoked with .call() and .apply() | wrun_01KPC47RH4X9V0HD71F4TJSFBK
  • customSerializationWorkflow - custom class serialization with WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE | wrun_01KPC47ZE4BSKAKZD0EFC6X31Q
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KPC4864PWSVBW7S8PBH8RC68
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KPC48HKKKSGX2TB8KEK052EX
  • stepFunctionAsStartArgWorkflow - step function reference passed as start() argument | wrun_01KPC48T69F3Y2P8JH0PXGRQFN
  • cancelRun - cancelling a running workflow | wrun_01KPC4912EB0DC5S51DJHJWPY6
  • cancelRun via CLI - cancelling a running workflow | wrun_01KPC49AGTZ824MGG7T4J90G1T
  • pages router addTenWorkflow via pages router
  • pages router promiseAllWorkflow via pages router
  • pages router sleepingWorkflow via pages router
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep | wrun_01KPC49PRA4SGNCXMXPVBTJXE4
  • sleepInLoopWorkflow - sleep inside loop with steps actually delays each iteration | wrun_01KPC4ABEZQ3RJJZTYCPCNY4C7
  • sleepWithSequentialStepsWorkflow - sequential steps work with concurrent sleep (control) | wrun_01KPC4AQ387K2F4DHPNT1KDCGK
  • importMetaUrlWorkflow - import.meta.url is available in step bundles | wrun_01KPC4AXVBBB1P0W3GKRSZ7P9F
  • metadataFromHelperWorkflow - getWorkflowMetadata/getStepMetadata work from module-level helper (#1577) | wrun_01KPC4AZYMX7CVCRA5E9JC30GV
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KPC4B250WFE18R720A7KZ9QS

Details by Category

❌ ▲ Vercel Production
App Passed Failed Skipped
❌ astro 79 2 7
❌ example 79 2 7
❌ express 79 2 7
✅ fastify 81 0 7
✅ hono 81 0 7
✅ nextjs-turbopack 86 0 2
✅ nextjs-webpack 86 0 2
❌ nitro 79 2 7
❌ nuxt 80 1 7
❌ sveltekit 80 1 7
✅ vite 81 0 7
✅ 💻 Local Development
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6
❌ 📦 Local Production
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
❌ vite-stable 81 1 6
✅ 🐘 Local Postgres
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6
✅ 🪟 Windows
App Passed Failed Skipped
✅ nextjs-turbopack 88 0 0
❌ 🌍 Community Worlds
App Passed Failed Skipped
❌ mongodb-dev 4 1 0
❌ redis-dev 4 1 0
❌ turso-dev 4 1 0
❌ turso 4 65 0
✅ 📋 Other
App Passed Failed Skipped
✅ e2e-local-dev-nest-stable 82 0 6
✅ e2e-local-postgres-nest-stable 82 0 6
✅ e2e-local-prod-nest-stable 82 0 6

📋 View full workflow run


Some E2E test jobs failed:

  • Vercel Prod: failure
  • Local Dev: success
  • Local Prod: failure
  • Local Postgres: success
  • Windows: success

Check the workflow run for details.

Member

@VaguelySerious VaguelySerious left a comment

LGTM assuming tests pass. Note that I've been seeing tons of flakes in stable lately, though

Contributor

Copilot AI left a comment

Pull request overview

Backports a runtime replay fix to prevent false-positive “unconsumed event” errors (notably step_created) when a for await hook loop resumes and schedules additional async hydration work after an initial promise-queue drain.

Changes:

  • Update EventsConsumer’s deferred unconsumed-event check to re-check the latest promise queue after yielding once.
  • Add a runWorkflow regression test covering delayed hydration in a for await hook loop with sequential steps.
  • Add a patch changeset for @workflow/core.
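
The first change in the list above can be pictured with a small model. The TypeScript below is a hypothetical simplification, not the actual EventsConsumer code: the class name DeferredUnconsumedCheck and the methods enqueue and scheduleCheck are assumptions made for illustration; only the ideas of a promise queue, subscribe(), and onUnconsumedEvent come from this PR. It sketches why waiting on a single snapshot of the queue can be too early, and how yielding once and then awaiting the latest queue tail avoids reporting an event that is about to be consumed.

```typescript
// Hypothetical, simplified model of a deferred unconsumed-event check.
type WorkflowEvent = { eventType: string };

class DeferredUnconsumedCheck {
  // Tail of a chain of pending async work (for example, payload hydration).
  private promiseQueue: Promise<void> = Promise.resolve();
  private checkVersion = 0;

  constructor(private onUnconsumedEvent: (event: WorkflowEvent) => void) {}

  // Append async work to the queue and return the new tail.
  enqueue(work: () => Promise<void>): Promise<void> {
    this.promiseQueue = this.promiseQueue.then(work);
    return this.promiseQueue;
  }

  // A consumer showed up: invalidate any check that is still in flight.
  subscribe(): void {
    this.checkVersion++;
  }

  async scheduleCheck(event: WorkflowEvent): Promise<void> {
    const version = ++this.checkVersion;
    // 1. Wait for the queue as it looks right now.
    await this.promiseQueue;
    // 2. Yield once so continuations of that work can append more work.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
    // 3. Re-read the queue: the tail may have moved during the yield.
    await this.promiseQueue;
    // 4. Report only if nobody subscribed in the meantime.
    if (version === this.checkVersion) {
      this.onUnconsumedEvent(event);
    }
  }
}
```

In this model, work that is appended only after the first drain (step 1) is still awaited in step 3, so a late subscribe() bumps the version before the report would fire.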

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| packages/core/src/events-consumer.ts | Adjusts unconsumed-event detection timing to reduce replay race false positives. |
| packages/core/src/workflow.test.ts | Adds a regression test for for await hook-loop replay ordering with delayed hydration. |
| .changeset/fix-hook-loop-unconsumed-event.md | Declares a patch release note for the backported fix. |

Comment on lines +4296 to +4330

try {
  const result = await runWorkflow(
    `const createHook = globalThis[Symbol.for("WORKFLOW_CREATE_HOOK")];
    const sleep = globalThis[Symbol.for("WORKFLOW_SLEEP")];
    const processPayload = globalThis[Symbol.for("WORKFLOW_USE_STEP")]("processPayload");
    async function workflow() {
      const hook = createHook({ token: 'test-token' });
      void sleep('1d');
      const results = [];
      for await (const payload of hook) {
        const processed = await processPayload(payload);
        results.push(processed);
        if (payload.done) {
          break;
        }
      }
      return results;
    }${getWorkflowTransformCode('workflow')}`,
    workflowRun,
    events,
    noEncryptionKey
  );

  expect(result).not.toBeInstanceOf(Error);
  const hydrated = await hydrateWorkflowReturnValue(
    result,
    workflowRunId,
    noEncryptionKey,
    ops
  );
  expect(hydrated).toEqual([
    { processed: true, type: 'subscribe', id: 1 },
    { processed: true, type: 'done' },
  ]);

Copilot AI Apr 16, 2026

This test intentionally calls sleep('1d') without awaiting it. runWorkflow() will warn on completion when there are pending operations (including waits), which will emit a console.warn via runtimeLogger.warn and can create noisy/flaky test output. Consider either removing the sleep/wait_created from this scenario, or explicitly spying on console.warn (as done in the existing “pending queue warnings” tests) to assert/suppress the warning.

Suggested change
const consoleWarnSpy = vi
  .spyOn(console, 'warn')
  .mockImplementation(() => {});
try {
  try {
    const result = await runWorkflow(
      `const createHook = globalThis[Symbol.for("WORKFLOW_CREATE_HOOK")];
      const sleep = globalThis[Symbol.for("WORKFLOW_SLEEP")];
      const processPayload = globalThis[Symbol.for("WORKFLOW_USE_STEP")]("processPayload");
      async function workflow() {
        const hook = createHook({ token: 'test-token' });
        void sleep('1d');
        const results = [];
        for await (const payload of hook) {
          const processed = await processPayload(payload);
          results.push(processed);
          if (payload.done) {
            break;
          }
        }
        return results;
      }${getWorkflowTransformCode('workflow')}`,
      workflowRun,
      events,
      noEncryptionKey
    );
    expect(result).not.toBeInstanceOf(Error);
    const hydrated = await hydrateWorkflowReturnValue(
      result,
      workflowRunId,
      noEncryptionKey,
      ops
    );
    expect(hydrated).toEqual([
      { processed: true, type: 'subscribe', id: 1 },
      { processed: true, type: 'done' },
    ]);
  } finally {
    consoleWarnSpy.mockRestore();
  }

Comment on lines +4287 to +4292
let callCount = 0;
const spy = vi
  .spyOn(serialization, 'hydrateStepReturnValue')
  .mockImplementation(async (...args) => {
    callCount++;
    const delay = [5, 5, 150, 5][callCount - 1] ?? 5;

Copilot AI Apr 16, 2026

The mocked hydration delay is driven by callCount, which makes the test brittle: any future change that adds/removes a hydrateStepReturnValue call (or changes call order) will silently alter which hydration is delayed and may stop exercising the intended race. Prefer keying the delay off something stable in the hydration input (e.g., the specific payload/step result marker) so the test keeps simulating the same timing scenario as implementation details evolve.

Suggested change
const shouldDelayHydration = (value: unknown): boolean => {
  if (!types.isProxy(value) && (typeof value !== 'object' || value === null)) {
    return false;
  }
  return (
    'type' in value &&
    (value as { type?: unknown }).type === 'done'
  );
};
const spy = vi
  .spyOn(serialization, 'hydrateStepReturnValue')
  .mockImplementation(async (...args) => {
    const delay = shouldDelayHydration(args[0]) ? 150 : 5;

@@ -0,0 +1,5 @@
---
'@workflow/core': patch

Copilot AI Apr 16, 2026

Changeset frontmatter in this repo uses double quotes for package names (e.g. "@workflow/core": patch). This file uses single quotes, which may break consistency and tooling expectations. Please switch to double quotes here.

Suggested change
"@workflow/core": patch

Comment on lines +135 to +141
this.pendingUnconsumedTimeout = setTimeout(() => {
  this.pendingUnconsumedTimeout = null;
  if (this.unconsumedCheckVersion === checkVersion) {
    this.pendingUnconsumedCheck = null;
    this.onUnconsumedEvent(currentEvent);
  }
}, 100);

Copilot AI Apr 16, 2026

The unconsumed-event check can be canceled via subscribe() (version invalidation), but this code still schedules the setTimeout even if checkVersion is already stale at the time this .then() runs. That leaves a stray timer that can't be cleared and can add unnecessary work under churn. Consider checking this.unconsumedCheckVersion === checkVersion before assigning pendingUnconsumedTimeout (and/or before the second queue wait) so canceled checks don’t schedule timers at all.
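
One way to read the suggestion above is sketched below in TypeScript. This is a hypothetical, standalone illustration rather than the code in events-consumer.ts: the class UnconsumedTimer, the method scheduleReport, and the delayMs parameter are assumptions; only the names pendingUnconsumedTimeout, unconsumedCheckVersion, onUnconsumedEvent, and subscribe() appear in the diff or the comment. The idea is to compare versions before calling setTimeout, so a check that was already invalidated never creates a timer.

```typescript
// Hypothetical sketch of guarding the timer with a version check.
class UnconsumedTimer<E> {
  pendingUnconsumedTimeout: ReturnType<typeof setTimeout> | null = null;
  unconsumedCheckVersion = 0;

  constructor(
    private onUnconsumedEvent: (event: E) => void,
    private delayMs = 100 // assumed default, mirroring the 100 ms in the diff
  ) {}

  // Invalidate in-flight checks and clear a timer that is already scheduled.
  subscribe(): void {
    this.unconsumedCheckVersion++;
    if (this.pendingUnconsumedTimeout !== null) {
      clearTimeout(this.pendingUnconsumedTimeout);
      this.pendingUnconsumedTimeout = null;
    }
  }

  // Called after the promise queue has drained for `checkVersion`.
  // Returns true only if a timer was actually scheduled.
  scheduleReport(checkVersion: number, currentEvent: E): boolean {
    if (this.unconsumedCheckVersion !== checkVersion) {
      return false; // stale check: do not schedule a timer at all
    }
    this.pendingUnconsumedTimeout = setTimeout(() => {
      this.pendingUnconsumedTimeout = null;
      // Re-check at fire time in case the version changed while waiting.
      if (this.unconsumedCheckVersion === checkVersion) {
        this.onUnconsumedEvent(currentEvent);
      }
    }, this.delayMs);
    return true;
  }
}
```

The check inside the timer callback is kept on purpose: the early guard only avoids creating stray timers, while the later one still covers invalidation that happens during the delay.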

@TooTallNate TooTallNate marked this pull request as draft April 16, 2026 20:50
3 participants