Summary
Two unit tests fail intermittently in CI (GitHub Actions Linux runner), causing the unit (linux) job to report failure. These are pre-existing flakes unrelated to any recent PR changes.
Failing Tests
1. prompt submitted during an active run is included in the next LLM input
- Location: Likely in test/session/ or test/prompt/
- Failure mode: Timing-dependent — test expects a prompt submitted during an active LLM run to be queued and included in the next input. Race condition between prompt submission and run completion.
- Frequency: Failed in every observed CI run (PRs #117-#120); may pass locally due to faster I/O
- Root cause hypothesis: Non-deterministic timing in async prompt submission pipeline. Test lacks explicit synchronization barrier.
- Suggested fix: Add an explicit waitFor/flush before the assertion, or mock the timing to remove non-determinism.
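A minimal sketch of the synchronization-barrier idea, not the project's actual test code. The `waitFor` helper, the `FakeSession` class, and its `submit`/`nextInput` methods are hypothetical stand-ins; the real session API may look different.

```typescript
// Hypothetical helper: poll a condition until it holds or a deadline passes.
async function waitFor(
  condition: () => boolean,
  timeoutMs = 1000,
  intervalMs = 5,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!condition()) {
    if (Date.now() > deadline) {
      throw new Error(`waitFor: condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Hypothetical stand-in for the session under test: a submitted prompt
// reaches the queue only on a later tick, which is where the race comes from.
class FakeSession {
  readonly queued: string[] = [];

  submit(prompt: string): void {
    setTimeout(() => this.queued.push(prompt), 20);
  }

  // Drain everything queued so far into the next LLM input.
  nextInput(): string[] {
    return this.queued.splice(0);
  }
}

async function demo(): Promise<string[]> {
  const session = new FakeSession();
  session.submit("follow-up");
  // Barrier: do not read the next input until the prompt is observably queued.
  await waitFor(() => session.queued.length === 1);
  return session.nextInput();
}
```

Without the `waitFor` call, `nextInput()` would run before the queued prompt arrives and the assertion would depend on scheduler timing. Polling on observable state keeps the test independent of how fast the runner is.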
2. hook.execute > runHook > timeout returns pass
- Location: Likely in test/hook/
- Failure mode: Takes 5000.98ms, exceeding the 5000ms timeout threshold by 0.98ms
- Frequency: Fails in CI where runners are slower, passes locally
- Root cause hypothesis: The timeout boundary is too tight for CI environments. The test expects the hook to resolve within exactly 5000ms but CI jitter pushes it over.
- Suggested fix: Either increase the timeout threshold by a small margin (e.g., 5100ms) or use a more generous timeout in CI environments via process.env.CI detection.
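One possible shape for the CI-aware threshold, as a sketch only. The `timeoutBudget` helper and its 100ms default margin are assumptions chosen to match the 5100ms example above. The environment is passed in as a parameter so the helper can be exercised without touching the real process environment.

```typescript
// Hypothetical helper: widen a timing budget when running on CI.
// GitHub Actions sets CI=true; values such as "" or "false" are treated as "not CI".
function timeoutBudget(
  baseMs: number,
  env: Record<string, string | undefined>,
  ciMarginMs = 100,
): number {
  const onCi = env.CI !== undefined && env.CI !== "" && env.CI !== "false";
  return onCi ? baseMs + ciMarginMs : baseMs;
}
```

In a test file this would be called as `timeoutBudget(5000, process.env)`, yielding 5100 on the CI runner and 5000 locally. A fixed margin only absorbs small jitter; a runner that is much slower would still need a fake timer or a shorter hook timeout.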
Impact
- These failures are non-blocking for PRs that don't touch session/hook code
- The unit results (linux) secondary job also reports failure as a consequence
- Developers may waste time investigating false positives
Evidence
CI logs from PRs #117-#120 (all show identical failures):
(fail) prompt submitted during an active run is included in the next LLM input [283.00ms]
(fail) hook.execute > runHook > timeout returns pass [5000.98ms]
Suggested Actions
- Apply a @flaky tag or retry annotation as an interim measure
Ref
Observed across PRs #117, #118, #119, #120 (2026-04-06)