fix(cortex): harden startup warmup and bulletin coordination by vsumner · Pull Request #248 · spacedriveapp/spacebot

vsumner · 2026-02-27T04:40:50Z

Summary

switch startup warmup fanout to JoinSet and enforce a bounded wait that aborts unfinished warmup tasks on timeout
remove unbounded post-timeout drain so startup timeout semantics remain bounded even with non-cooperative warmup tasks
make run_warmup_once cancellation-safe with a guard that demotes stuck Warming state to Degraded with actionable error context
prevent duplicate startup warmup passes by requiring Warm + refresh timestamp for initial-pass completion
add explicit debug observability when bulletin-loop generation is skipped due to fresh warmup bulletin
update cortex/config docs to define stale fallback threshold: bulletin_age_secs >= max(1, warmup.refresh_secs)
add targeted regressions for timeout cancellation, non-cooperative timeout bounds, initial warmup completion gating, and cancellation demotion
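
The stale fallback threshold from the list above can be written as a one-line predicate. This is a minimal sketch, not the code from the PR: the function name `bulletin_is_stale` and the `u64` second counts are assumptions made for illustration; only the formula `bulletin_age_secs >= max(1, warmup.refresh_secs)` comes from the description.

```rust
/// Hypothetical helper (name and types are assumptions, not from the PR).
/// Returns true when the cached bulletin is old enough that the bulletin
/// loop should fall back to generating a bulletin itself.
fn bulletin_is_stale(bulletin_age_secs: u64, warmup_refresh_secs: u64) -> bool {
    // max(1, refresh_secs) keeps the threshold at one second or more,
    // so a refresh_secs of 0 does not make every bulletin stale at age 0.
    bulletin_age_secs >= warmup_refresh_secs.max(1)
}
```

With `refresh_secs = 900`, a bulletin aged 899 seconds is still treated as fresh and one aged 900 seconds is stale.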

Testing

cargo fmt --all
cargo test -q startup_warmup_wait --bin spacebot
cargo test -q startup_warmup_wait_timeout_stays_bounded_for_non_cooperative_task --bin spacebot
cargo test -q cancelled_warmup_demotes_warming_state_to_degraded --lib
cargo test -q initial_warmup_completion_not_detected_when_timestamp_exists_but_state_is_not_warm --lib
cargo test -q bulletin_loop_generation_lock_snapshot_skips_after_fresh_update --lib
just preflight
just gate-pr

Notes (Optional)

Cargo surfaces a non-blocking future-incompat warning for dependency imap-proto v0.10.2; gates remain green.

Note

Changes Summary

This PR hardens the startup warmup and memory bulletin coordination by introducing a bounded timeout for warmup tasks and state machine guards for cancellation safety. Key changes include moving from unbounded async drains to a JoinSet-based approach with explicit abort on timeout, adding a WarmupRunGuard that demotes incomplete warmup states to degraded with error context, gating initial warmup completion by both state and timestamp to prevent duplicate passes, and introducing a bulletin loop snapshot check to skip redundant synthesis when the cached bulletin is still fresh. Documentation updates clarify the relationship between warmup refresh cadence and bulletin staleness thresholds. Five new regression tests cover timeout bounds for non-cooperative tasks, cancellation safety, and the initial completion gating logic.
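
The completion gating described above (state and timestamp must both agree) might look roughly like the following. The helper name `has_completed_initial_warmup` appears in the review walkthrough, but its signature, the `WarmupStatus` struct, the `Cold` variant, and the field names here are assumptions for illustration only.

```rust
/// Warmup states; Warming, Warm and Degraded are named in the PR,
/// Cold is an assumed initial state for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WarmupState {
    Cold,
    Warming,
    Warm,
    Degraded,
}

/// Hypothetical status snapshot (field names are assumptions).
struct WarmupStatus {
    state: WarmupState,
    last_refresh_unix_ms: Option<u64>,
}

/// The initial pass counts as complete only when the state is Warm
/// AND a refresh timestamp has been recorded. A timestamp left over
/// from an earlier run is not enough on its own.
fn has_completed_initial_warmup(status: &WarmupStatus) -> bool {
    status.state == WarmupState::Warm && status.last_refresh_unix_ms.is_some()
}
```

Requiring both conditions means a status that carries a timestamp but has since been demoted to Degraded is not mistaken for a finished initial pass.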

Written by Tembo for commit 3563f11.

coderabbitai · 2026-02-27T04:41:03Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5a050e and f05226c.

📒 Files selected for processing (2)

docs/content/docs/(configuration)/config.mdx
src/main.rs

Walkthrough

Adds a bounded per-agent startup warmup pass and integrates warmup as the primary bulletin refresh when enabled; introduces guarded warmup-run lifecycle (WarmupRunGuard), lock-aware bulletin generation, updated bulletin/warmup loops, docs updates, and related tests.
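
To illustrate the guarded warmup-run lifecycle, here is a minimal sketch of a commit-or-demote guard built on `Drop`. Only the name `WarmupRunGuard` and the Warming/Warm/Degraded states come from the PR; the `std::sync::Mutex`, the struct fields, the `begin` constructor, and the error string are assumptions, and the real code may well use an async lock instead.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, PartialEq, Eq)]
enum WarmupState {
    Warming,
    Warm,
    Degraded,
}

/// Hypothetical shared status (fields are assumptions).
#[derive(Debug)]
struct WarmupStatus {
    state: WarmupState,
    last_error: Option<String>,
}

/// Marks the status as Warming on creation. If the guard is dropped
/// without `commit`, a status still in Warming is demoted to Degraded.
struct WarmupRunGuard {
    status: Arc<Mutex<WarmupStatus>>,
    committed: bool,
}

impl WarmupRunGuard {
    fn begin(status: Arc<Mutex<WarmupStatus>>) -> Self {
        {
            let mut s = status.lock().unwrap();
            s.state = WarmupState::Warming;
            s.last_error = None;
        }
        Self { status, committed: false }
    }

    /// Records success; the Drop impl then leaves the state alone.
    fn commit(mut self) {
        {
            let mut s = self.status.lock().unwrap();
            s.state = WarmupState::Warm;
        }
        self.committed = true;
    }
}

impl Drop for WarmupRunGuard {
    fn drop(&mut self) {
        if self.committed {
            return;
        }
        // Runs on early return, panic unwinding, or a cancelled future.
        if let Ok(mut s) = self.status.lock() {
            if s.state == WarmupState::Warming {
                s.state = WarmupState::Degraded;
                s.last_error = Some("warmup cancelled before completion".to_string());
            }
        }
    }
}
```

Because the demotion lives in `Drop`, it also runs when the future holding the guard is dropped mid-run, which is the case an aborted task produces.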

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation `docs/content/docs/(configuration)/config.mdx`, `docs/content/docs/(core)/architecture.mdx`, `docs/content/docs/(core)/cortex.mdx` | Document warmup as primary bulletin refresher when enabled; update readiness/dispatch criteria to reference warmup state and bulletin freshness; describe reordered startup sequence including an initial warmup pass and updated cortex loop behavior. |
| Agent Cortex Logic `src/agent/cortex.rs` | Introduce warmup gating helpers (`should_generate_bulletin_from_bulletin_loop`, `has_completed_initial_warmup`), `apply_cancelled_warmup_status`, `WarmupRunGuard` with commit/drop semantics, `maybe_generate_bulletin_under_lock`, and refactor bulletin & warmup loops to use guarded generation and lock snapshots; add tests for these behaviors. |
| Startup Init & Tests `src/main.rs` | Add `wait_for_startup_warmup_tasks` to drain/timeout per-agent startup warmup tasks, spawn bounded (30s) startup warmup before adapters start receiving traffic, and add tests for normal completion, timeout, cancellation, and non-cooperative tasks. |
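
The bounded-wait idea behind `wait_for_startup_warmup_tasks` can be sketched without an async runtime. Note the substitution: the PR uses a tokio `JoinSet` and aborts unfinished tasks on timeout, whereas this sketch uses plain `std::thread` workers and a channel. OS threads cannot be aborted, so unfinished workers are simply left detached; the sketch only shows the single-deadline wait with no post-timeout drain. The function name `wait_bounded` and the `WaitOutcome` type are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// How many tasks reported completion before the deadline and how many did not.
#[derive(Debug, PartialEq, Eq)]
struct WaitOutcome {
    completed: usize,
    unfinished: usize,
}

/// Runs every task on its own thread and waits at most `timeout` in total.
fn wait_bounded(tasks: Vec<Box<dyn FnOnce() + Send + 'static>>, timeout: Duration) -> WaitOutcome {
    let total = tasks.len();
    let (done_tx, done_rx) = mpsc::channel::<()>();
    for task in tasks {
        let done_tx = done_tx.clone();
        thread::spawn(move || {
            task();
            // The waiter may already have given up; ignore a closed channel.
            done_tx.send(()).ok();
        });
    }
    drop(done_tx);

    // One shared deadline: each recv only gets the time that is left,
    // so the total wait never exceeds `timeout`.
    let deadline = Instant::now() + timeout;
    let mut completed = 0;
    while completed < total {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match done_rx.recv_timeout(remaining) {
            Ok(()) => completed += 1,
            // Deadline hit, or every remaining worker exited without reporting.
            Err(_) => break,
        }
    }
    WaitOutcome { completed, unfinished: total - completed }
}
```

The key property is that a task which never cooperates cannot extend the wait: once the deadline passes the loop exits and nothing further is drained.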

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

fix(cortex): harden startup warmup and bulletin coordination #248 — Implements similar startup warmup coordination, WarmupRunGuard-style cancellation/demotion behavior, and warmup-driven bulletin refresh with matching docs and tests.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: hardening startup warmup and bulletin coordination, which is the primary focus of the PR across multiple files. |
| Description check | ✅ Passed | The description is directly related to the changeset, providing clear details about warmup timeout handling, state machine guards, bulletin coordination, and regression tests. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

src/main.rs

coderabbitai

🧹 Nitpick comments (1)

src/main.rs (1)

1836-1836: Minor: Prefer .ok() over let _ = for channel sends.

Per coding guidelines, channel sends where the receiver may be dropped should use .ok() rather than let _ =.

🔧 Suggested change

-            let _ = locked_tx.send(());
+            locked_tx.send(()).ok();
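
For context on the suggestion above, a small sketch of the `.ok()` idiom follows. The commit history indicates the PR's test uses a oneshot sender; this sketch substitutes `std::sync::mpsc` so it has no dependencies, and the helper name `notify` is hypothetical.

```rust
use std::sync::mpsc;

/// Sends a unit signal. If the receiver has been dropped, `send` returns
/// an Err; `.ok()` converts that Result into an Option and discards it,
/// which states the intent more explicitly than `let _ =`.
fn notify(tx: &mpsc::Sender<()>) {
    tx.send(()).ok();
}
```

Both forms compile and behave the same at runtime; the difference is purely one of convention.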

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/main.rs` at line 1836, Replace the discard pattern "let _ =
locked_tx.send(())" with the idiomatic ".ok()" on the send call; specifically
change the call site that invokes locked_tx.send(()) so it becomes
locked_tx.send(()).ok() to explicitly ignore the Result when the receiver may
have been dropped (refer to the locked_tx.send(()) usage).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3563f11 and 2cbe40f.

📒 Files selected for processing (1)

src/main.rs

tembo bot reviewed Feb 27, 2026

coderabbitai bot reviewed Feb 27, 2026

vsumner added 3 commits February 27, 2026 00:18

fix(cortex): harden startup warmup and bulletin coordination

e9ec529

fix(startup): tighten warmup timeout test and join logging

2bafc16

style(startup): use .ok() for oneshot send in test

b5a050e

vsumner force-pushed the fix/warmup-bulletin-coordination branch from b4248bb to b5a050e on February 27, 2026 05:18

Merge branch 'main' into fix/warmup-bulletin-coordination

f05226c

jamiepine merged commit 636d880 into spacedriveapp:main Feb 27, 2026
3 checks passed

This was referenced Feb 28, 2026

Harden warm-recall degraded fallback and cache consistency #257

Open

feat(cortex): Implement Phase 1 ("The Tick Loop") #301

Merged

cortex: implement phase 2 health supervision #305

Merged

jamiepine merged 4 commits into spacedriveapp:main from vsumner:fix/warmup-bulletin-coordination
