fix(daemon): remove turn completion timeout #121

Merged
rowan-stein merged 3 commits into main from noa/issue-120
May 12, 2026
Conversation

@casey-brooks
Contributor

Summary

  • Remove active per-turn completion timeout from Codex and Claude execution paths while keeping bounded RPC/start/publish/ack/MCP timeouts.
  • Start keepalive and notification subscriber before initial sync, and retry failed syncs with exponential backoff that starts at 1s and is capped at 30s.
  • Add operation context to daemon/platform RPC errors and expand tests for startup ordering, sync retry, timeout removal, and error wrapping.
  • Allow tracing proxy tests to use an ephemeral listen address so local tests pass when the default OTLP port is occupied.

Closes #120

Test & Lint Summary

  • CGO_ENABLED=0 go test -count=1 ./...
    • Passed: 7 packages
    • Failed: 0
    • Skipped: 0
  • go vet ./...
    • Linting passed with no errors.

@casey-brooks
Contributor Author

Test & Lint Summary

  • CGO_ENABLED=0 go test -count=1 ./...
    • Passed: 7 packages
    • Failed: 0
    • Skipped: 0
  • go vet ./...
    • Linting passed with no errors.

@casey-brooks
Contributor Author

Update

Addressed Noa's blocker:

  • Removed the test's pre-seeded processing state.
  • Added subscriber readiness gating so initial sync starts only after the subscriber is actually subscribed.
  • Added a processing-start wake signal so keepalive touches the workload during the initial sync rather than waiting for the periodic tick.
  • Updated the startup-order test to assert subscriber readiness before sync and real keepalive touch during initial sync.

Test & Lint Summary

  • CGO_ENABLED=0 go test -count=1 ./...
    • Passed: 7 packages
    • Failed: 0
    • Skipped: 0
  • CGO_ENABLED=0 go vet ./...
    • Linting passed with no errors.

@casey-brooks
Contributor Author

Update

Addressed Noa's latest blocker:

  • Changed daemon startup to wait only until the subscriber goroutine has started, not until subscription succeeds.
  • Initial sync now proceeds even if Subscribe is still failing/retrying, so unacked messages continue through sync retry/backoff.
  • Kept subscribe-before-fetch ordering as best-effort: the subscriber goroutine starts before initial sync, but startup no longer blocks (and so cannot deadlock) on notification subscription readiness.
  • Added coverage that initial sync proceeds while subscriber is not ready.

Test & Lint Summary

  • CGO_ENABLED=0 go test -count=1 ./...
    • Passed: 7 packages
    • Failed: 0
    • Skipped: 0
  • CGO_ENABLED=0 go vet ./...
    • Linting passed with no errors.

@rowan-stein
Collaborator

Noa (reviewer) approved this PR after re-review at commit 10edc3c. Note: her environment cannot post the review because of a GitHub auth error ('Bad credentials'), but the approval is complete.

CI: build + e2e are green.

@rowan-stein merged commit 99d5b14 into main on May 12, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Remove per-turn completion timeout; fix agynd startup order; add sync retry/backoff

2 participants