feat: crash recovery and graceful shutdown (story 1.9)#11

Merged
vieiralucas merged 3 commits into main from feat/1-9-crash-recovery-graceful-shutdown
Feb 12, 2026

@vieiralucas vieiralucas commented Feb 12, 2026

Summary

  • Implement crash recovery that scans for expired leases on startup and reclaims them so messages re-enter the ready pool
  • Add WAL flush on graceful shutdown to ensure all writes are durable before exit
  • Add parse_lease_expiry_key() for extracting queue_id and msg_id from expiry keys
  • Add flush() method to Storage trait, implemented via flush_wal(true) in RocksDB
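
A minimal sketch of what a helper like parse_lease_expiry_key() could look like. The PR does not show the key format, so the layout here (8-byte big-endian expiry timestamp, 8-byte big-endian msg_id, then the queue_id as UTF-8) and the `lease_expiry_key` builder are assumptions made for illustration only:

```rust
/// Builds an expiry key under the ASSUMED layout (not taken from the PR):
/// [ expiry_ms: u64 big-endian ][ msg_id: u64 big-endian ][ queue_id: UTF-8 bytes ]
pub fn lease_expiry_key(expiry_ms: u64, queue_id: &str, msg_id: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(16 + queue_id.len());
    key.extend_from_slice(&expiry_ms.to_be_bytes());
    key.extend_from_slice(&msg_id.to_be_bytes());
    key.extend_from_slice(queue_id.as_bytes());
    key
}

/// Extracts (queue_id, msg_id) from an expiry key.
/// Returns None instead of panicking when the key is malformed.
pub fn parse_lease_expiry_key(key: &[u8]) -> Option<(String, u64)> {
    // Need the timestamp, the msg_id, and a non-empty queue_id.
    if key.len() <= 16 {
        return None;
    }
    let mut id_bytes = [0u8; 8];
    id_bytes.copy_from_slice(&key[8..16]);
    // Reject queue ids that are not valid UTF-8.
    let queue_id = std::str::from_utf8(&key[16..]).ok()?;
    Some((queue_id.to_owned(), u64::from_be_bytes(id_bytes)))
}
```

Putting the timestamp first in big-endian form means keys sort by expiry time, which is what would make a "scan up to now" range query possible. Returning `Option` lets the caller skip a corrupt key rather than abort recovery.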

Test plan

  • recovery_preserves_messages_after_restart — enqueue 5 messages, restart scheduler, verify all delivered
  • recovery_reclaims_expired_leases — create expired lease, restart, verify reclaimed and message delivered to consumer
  • recovery_preserves_queue_definitions — create 3 queues, restart, verify all present
  • shutdown_flushes_wal — enqueue, shutdown, reopen storage from disk, verify data survived
  • 51 tests pass, clippy clean, fmt clean
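
The idea behind shutdown_flushes_wal can be sketched with a toy model. Everything below is hypothetical: the trimmed-down `Storage` trait, `FakeStorage`, `reopen`, and `shutdown` are stand-ins, and the model deliberately simplifies durability (writes are invisible until flushed). Per the summary above, the real RocksDB implementation calls flush_wal(true) inside flush().

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down version of a Storage trait with a flush() method.
pub trait Storage {
    fn put(&mut self, key: &[u8], value: &[u8]);
    /// Makes all previously accepted writes durable.
    fn flush(&mut self) -> Result<(), String>;
}

/// Toy backend: `pending` models writes that have not been synced yet,
/// `durable` models what would still be there after a restart.
#[derive(Default)]
pub struct FakeStorage {
    pending: Vec<(Vec<u8>, Vec<u8>)>,
    durable: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Storage for FakeStorage {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.pending.push((key.to_vec(), value.to_vec()));
    }

    fn flush(&mut self) -> Result<(), String> {
        // Move every pending write into the durable map, in write order.
        for (k, v) in self.pending.drain(..) {
            self.durable.insert(k, v);
        }
        Ok(())
    }
}

impl FakeStorage {
    /// Simulates closing and reopening from disk: unflushed writes are lost.
    pub fn reopen(self) -> FakeStorage {
        FakeStorage { pending: Vec::new(), durable: self.durable }
    }

    /// Reads only durable data in this simplified model.
    pub fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.durable.get(key)
    }
}

/// Graceful shutdown path: flush before the process exits.
pub fn shutdown<S: Storage>(storage: &mut S) -> Result<(), String> {
    storage.flush()
}
```

A test in the spirit of shutdown_flushes_wal would call `put`, then `shutdown`, then `reopen`, and assert that `get` still returns the value.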

Recreated after GitHub auto-closed the original PR #9


Summary by cubic

Adds crash recovery and graceful shutdown so the broker survives restarts without message loss. On startup we reclaim expired leases and return messages to the ready pool; on shutdown we flush the RocksDB WAL for durability. (Linear story 1.9)

  • New Features
    • Recovery: run at scheduler start; scan lease_expiry up to now; atomically delete expired lease and expiry; return messages to ready; skip corrupt expiry keys; preserve non-expired leases; log reclaimed and queue_count.
    • Shutdown: add Storage::flush (RocksDB flush_wal(true)); call on exit.
    • Utilities: add parse_lease_expiry_key.
    • Tests: verify message persistence across restart, expired lease reclamation and delivery, queue definitions survival, WAL flush durability, parse_lease_expiry_key roundtrip and corrupt input rejection, and that recovery skips corrupt keys and keeps active leases.
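
The recovery steps listed above can be sketched against an in-memory stand-in for the store. This is not the PR's code: `State`, `lease`, `recover`, the key helpers, and the key layout (8-byte big-endian expiry, 8-byte big-endian msg_id, UTF-8 queue_id) are all assumptions for illustration. A real RocksDB-backed version would presumably apply the deletes and the ready-pool write in one atomic batch.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical in-memory stand-in for the persistent state.
#[derive(Default)]
pub struct State {
    /// Expiry keys, ordered by their timestamp prefix.
    pub lease_expiry: BTreeSet<Vec<u8>>,
    /// Active leases as (queue_id, msg_id).
    pub leases: BTreeSet<(String, u64)>,
    /// Ready messages per queue.
    pub ready: BTreeMap<String, VecDeque<u64>>,
}

/// ASSUMED layout: [ expiry_ms: u64 BE ][ msg_id: u64 BE ][ queue_id: UTF-8 ].
pub fn expiry_key(expiry_ms: u64, queue_id: &str, msg_id: u64) -> Vec<u8> {
    let mut key = expiry_ms.to_be_bytes().to_vec();
    key.extend_from_slice(&msg_id.to_be_bytes());
    key.extend_from_slice(queue_id.as_bytes());
    key
}

fn parse_key(key: &[u8]) -> Option<(String, u64)> {
    if key.len() <= 16 {
        return None;
    }
    let mut id = [0u8; 8];
    id.copy_from_slice(&key[8..16]);
    let queue_id = std::str::from_utf8(&key[16..]).ok()?.to_owned();
    Some((queue_id, u64::from_be_bytes(id)))
}

/// Records a lease and its expiry entry.
pub fn lease(state: &mut State, expiry_ms: u64, queue_id: &str, msg_id: u64) {
    state.leases.insert((queue_id.to_owned(), msg_id));
    state.lease_expiry.insert(expiry_key(expiry_ms, queue_id, msg_id));
}

/// Reclaims leases whose expiry is <= now_ms; returns how many were reclaimed.
pub fn recover(state: &mut State, now_ms: u64) -> usize {
    // Every key whose timestamp prefix is <= now_ms sorts before this bound.
    let upper = now_ms.saturating_add(1).to_be_bytes().to_vec();
    let expired: Vec<Vec<u8>> = state.lease_expiry.range(..upper).cloned().collect();
    let mut reclaimed = 0;
    for key in expired {
        let (queue_id, msg_id) = match parse_key(&key) {
            Some(parsed) => parsed,
            None => continue, // corrupt key: skip it, do not panic
        };
        // Delete lease and expiry entry, then return the message to ready.
        state.lease_expiry.remove(&key);
        state.leases.remove(&(queue_id.clone(), msg_id));
        state.ready.entry(queue_id).or_default().push_back(msg_id);
        reclaimed += 1;
    }
    reclaimed
}
```

Because the scan stops at the upper bound, leases that expire after `now_ms` are never visited, which matches the "preserve non-expired leases" behavior described above.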

Written for commit 7cd3970. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 6 files

@vieiralucas vieiralucas force-pushed the feat/1-8-message-acknowledgment branch from 69e45f6 to c00fee8 on February 12, 2026 01:38
@vieiralucas vieiralucas force-pushed the feat/1-9-crash-recovery-graceful-shutdown branch from 03b03ce to e7f66bf on February 12, 2026 01:38
@vieiralucas vieiralucas force-pushed the feat/1-8-message-acknowledgment branch from c00fee8 to e7a8f09 on February 12, 2026 01:43
@vieiralucas vieiralucas force-pushed the feat/1-9-crash-recovery-graceful-shutdown branch 2 times, most recently from 581bc40 to 8d2c8f9 on February 12, 2026 01:45
Base automatically changed from feat/1-8-message-acknowledgment to main on February 12, 2026 01:48
add recovery logic that scans for expired leases on startup and reclaims
them so messages re-enter the ready pool. flush the rocksdb wal on
graceful shutdown to ensure all writes are durable. includes integration
tests for message persistence across restarts, expired lease reclamation,
queue definition survival, and wal flush verification.
@vieiralucas vieiralucas force-pushed the feat/1-9-crash-recovery-graceful-shutdown branch from 8d2c8f9 to 30cea52 on February 12, 2026 01:48
- parse_lease_expiry_key roundtrip and corrupt input rejection
- recovery skips corrupt lease_expiry keys without panicking
- recovery preserves non-expired leases (active messages not reclaimed)
@vieiralucas vieiralucas merged commit cdb55bd into main Feb 12, 2026
5 checks passed
@vieiralucas vieiralucas deleted the feat/1-9-crash-recovery-graceful-shutdown branch February 12, 2026 01:53