fix(test): eliminate memory::ops flakes under cargo-llvm-cov (#2722) #2737
Two root causes:
1. UnifiedMemory had no busy_timeout — concurrent writes from background
ingestion workers hit SQLITE_BUSY immediately (no retry), causing
the tool_memory .expect("put normal") panic under llvm-cov's slower
execution. Fix: set busy_timeout(15s) on Connection::open, matching
the chunks/store.rs pattern.
2. IngestionState.queue_depth AtomicUsize was never reset between tests
— residue from prior tests' background workers inflated queue_depth
assertions. Fix: add reset_for_test() (cfg(test)) that zeroes
queue_depth and clears running state while preserving completion
history; call it at the start of the status snapshot test.
Both changes are covered by regression tests.
Closes tinyhumansai#2722
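The difference between failing immediately and retrying under a busy timeout can be sketched with the standard library alone. This is an illustration of the semantics only, not the PR's code: the `Mutex` stands in for SQLite's write lock, and `write_with_busy_timeout`, `WriteError`, and `demo` are hypothetical names invented for this sketch.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex, TryLockError};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical error type; `Busy` plays the role of SQLITE_BUSY.
#[derive(Debug, PartialEq)]
enum WriteError {
    Busy,
}

/// Try to take the write lock, retrying until `busy_timeout` elapses.
/// A zero timeout reproduces the fail-immediately behaviour.
fn write_with_busy_timeout(
    db: &Mutex<Vec<String>>,
    row: &str,
    busy_timeout: Duration,
) -> Result<(), WriteError> {
    let deadline = Instant::now() + busy_timeout;
    loop {
        match db.try_lock() {
            Ok(mut rows) => {
                rows.push(row.to_string());
                return Ok(());
            }
            // Lock is held elsewhere and there is still time left: back off and retry.
            Err(TryLockError::WouldBlock) if Instant::now() < deadline => {
                thread::sleep(Duration::from_millis(5));
            }
            // Deadline passed (or the lock is poisoned): report busy.
            Err(_) => return Err(WriteError::Busy),
        }
    }
}

/// A background "ingestion worker" holds the lock briefly while a test writes.
fn demo() -> (Result<(), WriteError>, Result<(), WriteError>) {
    let db: Arc<Mutex<Vec<String>>> = Arc::new(Mutex::new(Vec::new()));
    let worker_db = Arc::clone(&db);
    let (locked_tx, locked_rx) = mpsc::channel();

    let worker = thread::spawn(move || {
        let mut rows = worker_db.lock().unwrap();
        locked_tx.send(()).unwrap(); // signal that the lock is now held
        thread::sleep(Duration::from_millis(100));
        rows.push("background".to_string());
    });

    locked_rx.recv().unwrap(); // wait until the worker holds the lock
    let without_timeout = write_with_busy_timeout(&db, "put normal", Duration::ZERO);
    let with_timeout = write_with_busy_timeout(&db, "put normal", Duration::from_secs(15));
    worker.join().unwrap();
    (without_timeout, with_timeout)
}
```

Calling `demo()` yields a busy error for the zero-timeout write and a successful write for the 15-second one, because the second call keeps retrying until the worker releases the lock. The real fix relies on SQLite's own busy handler rather than a hand-written loop.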
Code Review: PR #2737
fix(test): eliminate memory::ops flakes under cargo-llvm-cov (#2722)
Overview
Fixes two independent root causes behind the intermittent test flakes.
Verdict: Approve ✅. Clean, well-motivated fix with appropriate regression tests and documentation.
PR Review: fix(test): eliminate memory::ops flakes under cargo-llvm-cov (#2722)
Status: ✅ All CI checks passing.
What this PR does
Fixes two independent root causes of the test flakes.
No issues found
The change is minimal (86 additions, 7 deletions across 4 files), correct, and well-tested. Recommend merge.
graycyrus left a comment
Code looks great — solid fix for the cargo-llvm-cov flakes. The two-pronged approach (SQLite busy-timeout + global state reset) is clean, well-tested, and matches existing patterns in the codebase.
One CI check is failing though: test / Rust Core Tests (Windows — secrets ACL). This looks unrelated to your changes (you're touching memory/ingestion, not secrets), so it might be a transient Windows flake. Either way, once CI is 100% green, I'll approve this. Can you re-run it, or check whether that test is flaky on main?
Merge upstream/main to pick up fix(test): eliminate memory::ops flakes under cargo-llvm-cov (tinyhumansai#2722/tinyhumansai#2737) which fixes the pre-existing execute_success_path_persists_rule_in_isolated_workspace test failure. Also add devWorkflow i18n keys to the new Polish (pl) locale chunk added upstream.
Summary
- Sets busy_timeout(15s) on the UnifiedMemory SQLite connection in init.rs, immediately after Connection::open() and before any execute_batch() calls. Without this, concurrent writes from background ingestion workers hit SQLITE_BUSY immediately (no retry) under cargo-llvm-cov's slower execution, causing the tool_memory .expect("put normal") panic.
- Adds IngestionState::reset_for_test() (#[cfg(test)]), which zeroes queue_depth and clears running state while preserving completion history, and calls it at the start of memory_ingestion_status_reflects_initialized_client_snapshot. This replaces the delta-baseline workaround from "fix(test): make memory ingestion-status test residue-robust (queue_depth delta)" #2721 with a structural fix.
- Adds regression tests: connection_has_busy_timeout_set, reset_for_test_clears_queue_depth_and_running_state, reset_for_test_preserves_completion_history.
Root cause
Two independent causes behind the flakes seen in #2717 CI runs:
1. SQLITE_BUSY on writes: UnifiedMemory had no busy_timeout set, so any concurrent write attempt (background ingestion worker + test write) failed immediately instead of retrying. chunks/store.rs already uses 15 s; unified/init.rs now does too.
2. queue_depth residue: IngestionState.queue_depth is an AtomicUsize on the process-global MemoryClient singleton, shared across all memory::ops tests. Background ingestion workers started by earlier tests outlive the GLOBAL_MEMORY_TEST_LOCK body, leaving residue that inflated absolute queue_depth assertions.
Test plan
- cargo test -p openhuman -- memory::ingestion::state::tests::reset_for_test: 2 passed
- cargo test -p openhuman -- memory_store::unified::init::tests::connection_has_busy_timeout: 1 passed
- cargo test -p openhuman -- memory::ops::sync::tests: 6 passed
- cargo test -p openhuman -- memory::ops::tool_memory::tests: 2 passed
- cargo fmt --all -- --check: clean
- cargo check -p openhuman: no new errors
Closes #2722
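For readers unfamiliar with the reset pattern, here is a minimal sketch of how a test-only reset on shared state might look. It assumes a much simplified shape for `IngestionState`: the `running` and `completed` fields, the `enqueue`/`start`/`finish` helpers, and `snapshot` are hypothetical, since the real struct is not shown in this PR text. Only `queue_depth` being an `AtomicUsize` and the behaviour of `reset_for_test` come from the description above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the PR's `IngestionState`.
#[derive(Default)]
struct IngestionState {
    queue_depth: AtomicUsize,
    running: Mutex<Option<String>>, // assumed: the document currently being ingested
    completed: Mutex<Vec<String>>,  // assumed: completion history
}

impl IngestionState {
    /// Record one more queued document.
    fn enqueue(&self) {
        self.queue_depth.fetch_add(1, Ordering::SeqCst);
    }

    /// Mark a document as currently running.
    fn start(&self, doc: &str) {
        *self.running.lock().unwrap() = Some(doc.to_string());
    }

    /// Move the running document into the history and shrink the queue.
    fn finish(&self) {
        let finished = self.running.lock().unwrap().take();
        if let Some(doc) = finished {
            self.completed.lock().unwrap().push(doc);
            // Saturating decrement so the counter never underflows.
            let _ = self
                .queue_depth
                .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |d| {
                    Some(d.saturating_sub(1))
                });
        }
    }

    /// Zero the queue depth and clear running state, but keep the history.
    /// The PR gates this with #[cfg(test)]; the gate is omitted here so the
    /// sketch compiles on its own.
    fn reset_for_test(&self) {
        self.queue_depth.store(0, Ordering::SeqCst);
        *self.running.lock().unwrap() = None;
    }

    /// (queue depth, is something running, number of completed documents)
    fn snapshot(&self) -> (usize, bool, usize) {
        (
            self.queue_depth.load(Ordering::SeqCst),
            self.running.lock().unwrap().is_some(),
            self.completed.lock().unwrap().len(),
        )
    }
}
```

A test that calls `reset_for_test()` first can then assert on absolute `queue_depth` values, instead of computing a delta against whatever residue earlier tests left behind.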