Bug
A couple of runtime tests are too sensitive to machine speed and process startup latency, so the full suite can fail intermittently on slower environments even when the implementation is correct.
Symptoms
The failures show up in background-task tests such as:
- task --background enqueues a detached worker and exposes per-job status
- cancel sends turn interrupt to the shared app-server before killing a brokered task
Observed failures include:
- expecting a job to reach completed, but it is still running
- timing out while waiting for a brokered background task to expose threadId / turnId
Root Cause
The tests assume a very short end-to-end startup window for detached workers and shared-broker task startup. In practice, the sequence includes:
- detached worker spawn
- Node process startup
- broker startup / reuse
- fake Codex app-server startup
- task turn start and state persistence
The existing wait windows are tight enough that normal startup variance can make the assertions flaky.
The interruptible fake task also completes quickly enough that cancellation tests can race with normal completion.
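To illustrate why the width of the wait window matters, here is a minimal sketch of a deadline-based polling helper of the kind such tests typically rely on. The name `waitFor`, its options, and the default values are assumptions for illustration, not the project's actual test utilities.

```javascript
// Hypothetical helper (not from the repo): poll `check` until it returns a
// truthy value, or throw once `timeoutMs` has elapsed.
async function waitFor(check, { timeoutMs = 15000, intervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await check();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Sleep briefly before polling again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With a helper like this, a generous `timeoutMs` only costs time when the condition never becomes true; a passing run still returns as soon as the job reports the expected state, so widening the window should not slow down healthy runs.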
Impact
- npm test can fail on slower machines for timing reasons rather than functional regressions.
- It becomes harder to trust red builds.
- Contributors may chase nonexistent runtime bugs.
Suggested Fix
Relax the wait windows for the affected background-task tests and widen the interrupt window in the fake fixture so cancellation assertions reliably observe a cancellable in-flight task.
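As a rough sketch of the fixture side of this fix, the fake task could stay in flight long enough that a cancel request always arrives before natural completion. The function `startInterruptibleTask` and its `durationMs` option are hypothetical names, and the default is an arbitrary placeholder; the real fake app-server fixture may be structured quite differently.

```javascript
// Hypothetical fake task (not the repo's fixture): it completes on its own
// after `durationMs` unless interrupt() is called first.
function startInterruptibleTask({ durationMs = 5000 } = {}) {
  let settle;
  const done = new Promise((resolve) => {
    settle = resolve;
  });
  const timer = setTimeout(() => settle("completed"), durationMs);
  return {
    done,
    interrupt() {
      // Cancel the natural completion; if the task already completed,
      // this second settle call has no effect.
      clearTimeout(timer);
      settle("interrupted");
    },
  };
}
```

A larger `durationMs` widens the window in which `interrupt()` can still win the race, which is the property the cancellation assertions depend on.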