
feat: add partition-aware storage trait abstraction#49

Closed
vieiralucas wants to merge 5 commits into main from feat/13.1-storage-trait-rocksdb-adapter
Conversation


@vieiralucas (Member) commented Mar 8, 2026

Summary

  • Adds PartitionId newtype (u32 with DEFAULT constant) for partition-namespace awareness in preparation for multi-partition clustering (Epic 14)
  • Evolves Storage trait: all methods now accept partition: &PartitionId parameter (except flush())
  • Renames StorageError::RocksDb → StorageError::Backend for backend-agnosticism
  • Updates all scheduler, handler, recovery, delivery, and Lua engine call sites to pass PartitionId::DEFAULT
  • 294 tests pass, clippy clean, 21 files changed
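The changes above can be sketched as a minimal Rust shape. Everything here beyond `PartitionId`, its `DEFAULT` constant, `StorageError::Backend`, and the partition-agnostic `flush()` — in particular the trait method names and signatures — is an illustrative assumption, not the actual `fila-core` API.

```rust
/// Hypothetical sketch of the partition-aware trait described in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PartitionId(pub u32);

impl PartitionId {
    /// Single-node mode runs everything on one well-known partition.
    pub const DEFAULT: PartitionId = PartitionId(0);
}

/// Backend-agnostic error variant (renamed from `RocksDb`).
#[derive(Debug)]
pub enum StorageError {
    Backend(String),
}

pub trait Storage {
    // Every method takes a partition for namespace awareness...
    fn get(&self, partition: &PartitionId, key: &[u8]) -> Result<Option<Vec<u8>>, StorageError>;
    fn put(&mut self, partition: &PartitionId, key: &[u8], value: &[u8]) -> Result<(), StorageError>;
    // ...except `flush()`, which flushes the whole backend.
    fn flush(&mut self) -> Result<(), StorageError>;
}
```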

Design Decisions

  • Partition at write_batch() call site, not per WriteBatchOp variant — cleaner API
  • keys.rs not updated with partition parameter — partition handled at Storage trait level (key namespacing deferred to Story 13.2+ when custom storage engine is built)
  • delivery.rs uses &self.partition direct field access instead of self.p() to avoid borrow checker conflict with self.consumer_rr_idx
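The first decision above (partition at the `write_batch()` call site rather than per `WriteBatchOp` variant) can be illustrated with a hedged sketch. The variant shapes and the key-prefixing scheme shown are assumptions for illustration only — per the PR, actual key namespacing is deferred to Story 13.2+.

```rust
// Hypothetical sketch: batch ops carry no partition field; the call site
// supplies one partition for the whole batch.
#[derive(Debug, Clone, Copy)]
pub struct PartitionId(pub u32);

pub enum WriteBatchOp {
    Put { key: Vec<u8>, value: Vec<u8> },
    Delete { key: Vec<u8> },
}

/// Namespace each op's key by the batch-level partition and return the
/// resulting keys (a stand-in for applying the batch to a real backend).
fn write_batch(partition: &PartitionId, ops: &[WriteBatchOp]) -> Vec<Vec<u8>> {
    ops.iter()
        .map(|op| {
            let key = match op {
                WriteBatchOp::Put { key, .. } => key,
                WriteBatchOp::Delete { key } => key,
            };
            // Prefix with the partition id in big-endian byte order.
            let mut namespaced = partition.0.to_be_bytes().to_vec();
            namespaced.extend_from_slice(key);
            namespaced
        })
        .collect()
}
```

The alternative — a partition field on every `WriteBatchOp` variant — would force each call site to repeat the same value per op and allow mixed-partition batches, which the call-site design rules out by construction.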

Test plan

  • cargo test --workspace — 294 tests pass
  • cargo clippy --workspace --all-targets — clean
  • CI passes
  • Cubic review findings addressed

🤖 Generated with Claude Code


Summary by cubic

Introduces partition-aware storage by adding PartitionId and updating the Storage trait to require a partition for all operations. RocksDB, scheduler, and Lua run on the default partition so behavior stays the same, satisfying Story 13.1 (Epic 13) and preparing for multi-partition clustering.

  • New Features

    • Added PartitionId(u32) with DEFAULT; all Storage methods now take &PartitionId (except flush()).
    • Updated RocksDB adapter and all scheduler/Lua paths to pass the default partition; existing key encoding preserved.
    • Renamed StorageError::RocksDb to StorageError::Backend.
    • Partition passed at write_batch() call site (no per-variant partition).
  • Migration

    • Update Storage implementors to new method signatures.
    • Pass PartitionId::DEFAULT at call sites (scheduler, Lua, tests).
    • LuaEngine::new() now takes a PartitionId; pass PartitionId::DEFAULT.
    • Replace matches on StorageError::RocksDb with StorageError::Backend.
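
A hedged before/after sketch of the migration steps above — the surrounding function and the error payload type are hypothetical stand-ins, but the variant rename and the `PartitionId::DEFAULT` convention come from the PR:

```rust
// Hypothetical migration sketch.
#[derive(Debug, Clone, Copy)]
pub struct PartitionId(pub u32);

impl PartitionId {
    pub const DEFAULT: PartitionId = PartitionId(0);
}

#[derive(Debug)]
pub enum StorageError {
    Backend(String), // was: RocksDb(String)
}

fn describe(err: &StorageError) -> String {
    match err {
        // Before: StorageError::RocksDb(msg) => ...
        StorageError::Backend(msg) => format!("backend error: {msg}"),
    }
}
```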

Written for commit f3c67ae. Summary will update on new commits.

Add PartitionId type with DEFAULT constant for single-node mode.
All Storage trait methods now accept a partition parameter, enabling
future multi-partition clustering. Rename StorageError::RocksDb to
Backend for backend-agnosticism. Update all scheduler call sites,
Lua bridge, and tests to pass PartitionId::DEFAULT.

@cubic-dev-ai (Bot) left a comment


1 issue found across 24 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="_bmad-output/implementation-artifacts/13-1-storage-trait-abstraction-rocksdb-adapter.md">

<violation number="1" location="_bmad-output/implementation-artifacts/13-1-storage-trait-abstraction-rocksdb-adapter.md:31">
P2: Acceptance Criteria #2 and #5 contradict the actual implementation described in the Tasks and Completion Notes sections of this same document. AC #2 says `WriteBatchOp` is "extended to carry partition context" (it wasn't—partition is passed at `write_batch()` call-site instead), and AC #5 says `keys.rs` functions accept a partition parameter (Task 3 explicitly marks this as a design deviation that was *not* done). Update the ACs to reflect the actual design decisions, or add clear "Not Met / Deferred" annotations, so someone checking story completion against ACs isn't misled.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@@ -0,0 +1,229 @@
# Story 13.1: Storage Trait Abstraction & RocksDB Adapter

@cubic-dev-ai (Bot) commented Mar 8, 2026


P2: Acceptance Criteria #2 and #5 contradict the actual implementation described in the Tasks and Completion Notes sections of this same document. AC #2 says WriteBatchOp is "extended to carry partition context" (it wasn't—partition is passed at write_batch() call-site instead), and AC #5 says keys.rs functions accept a partition parameter (Task 3 explicitly marks this as a design deviation that was not done). Update the ACs to reflect the actual design decisions, or add clear "Not Met / Deferred" annotations, so someone checking story completion against ACs isn't misled.


<file context>
@@ -0,0 +1,229 @@
+   **And** a `PartitionId` type is defined with a `DEFAULT` constant for single-partition mode
+   **And** the default partition preserves current key encoding behavior exactly
+
+2. **Given** the existing `RocksDbStorage` at `crates/fila-core/src/storage/rocksdb.rs`
+   **When** it implements the partition-aware `Storage` trait
+   **Then** the default partition maps to the current column family key encoding unchanged
</file context>

github-actions Bot commented Mar 8, 2026

Benchmark Results (median of 3 runs)

Commit: cba40a5
Time: 2026-03-08T23:12:22Z

| Benchmark | Value | Unit |
|---|---|---|
| compaction_active_p99 | 0.481 | ms |
| compaction_idle_p99 | 0.472 | ms |
| compaction_p99_delta | 0.0233 | ms |
| consumer_concurrency_100_throughput | 2939.7 | msg/s |
| consumer_concurrency_10_throughput | 2910.7 | msg/s |
| consumer_concurrency_1_throughput | 693.3 | msg/s |
| e2e_latency_p50_light | 0.399 | ms |
| e2e_latency_p95_light | 0.475 | ms |
| e2e_latency_p99_light | 0.591 | ms |
| enqueue_throughput_1kb | 2718.5 | msg/s |
| enqueue_throughput_1kb_mbps | 2.65 | MB/s |
| fairness_accuracy_max_deviation | 0.20 | % deviation |
| fairness_accuracy_tenant-1 | 0.20 | % deviation |
| fairness_accuracy_tenant-2 | 0.20 | % deviation |
| fairness_accuracy_tenant-3 | 0.10 | % deviation |
| fairness_accuracy_tenant-4 | 0.10 | % deviation |
| fairness_accuracy_tenant-5 | 0.10 | % deviation |
| fairness_overhead_fair_throughput | 1417.5 | msg/s |
| fairness_overhead_fifo_throughput | 1458.2 | msg/s |
| fairness_overhead_pct | 2.53 | % |
| key_cardinality_10_throughput | 2477.5 | msg/s |
| key_cardinality_10k_throughput | 1273.2 | msg/s |
| key_cardinality_1k_throughput | 2116.2 | msg/s |
| lua_on_enqueue_overhead_us | 13.7 | us |
| lua_throughput_with_hook | 1197.8 | msg/s |
| memory_per_message_overhead | 2925.4 | bytes/msg |
| memory_rss_idle | 162.9 | MB |
| memory_rss_loaded_10k | 190.5 | MB |

github-actions Bot commented Mar 8, 2026

Benchmark Results (median of 3 runs)

Commit: 6e94e27
Time: 2026-03-08T23:18:30Z

| Benchmark | Value | Unit |
|---|---|---|
| compaction_active_p99 | 0.409 | ms |
| compaction_idle_p99 | 0.420 | ms |
| compaction_p99_delta | 0.0046 | ms |
| consumer_concurrency_100_throughput | 2947.0 | msg/s |
| consumer_concurrency_10_throughput | 2997.7 | msg/s |
| consumer_concurrency_1_throughput | 702.0 | msg/s |
| e2e_latency_p50_light | 0.329 | ms |
| e2e_latency_p95_light | 0.386 | ms |
| e2e_latency_p99_light | 0.578 | ms |
| enqueue_throughput_1kb | 3284.3 | msg/s |
| enqueue_throughput_1kb_mbps | 3.21 | MB/s |
| fairness_accuracy_max_deviation | 0.20 | % deviation |
| fairness_accuracy_tenant-1 | 0.20 | % deviation |
| fairness_accuracy_tenant-2 | 0.20 | % deviation |
| fairness_accuracy_tenant-3 | 0.10 | % deviation |
| fairness_accuracy_tenant-4 | 0.10 | % deviation |
| fairness_accuracy_tenant-5 | 0.10 | % deviation |
| fairness_overhead_fair_throughput | 1326.0 | msg/s |
| fairness_overhead_fifo_throughput | 1421.6 | msg/s |
| fairness_overhead_pct | 3.34 | % |
| key_cardinality_10_throughput | 3077.6 | msg/s |
| key_cardinality_10k_throughput | 1279.3 | msg/s |
| key_cardinality_1k_throughput | 2211.5 | msg/s |
| lua_on_enqueue_overhead_us | 15.0 | us |
| lua_throughput_with_hook | 1142.0 | msg/s |
| memory_per_message_overhead | 978.1 | bytes/msg |
| memory_rss_idle | 177.6 | MB |
| memory_rss_loaded_10k | 187.3 | MB |

github-actions Bot commented Mar 8, 2026

Benchmark Results (median of 3 runs)

Commit: 79dd327
Time: 2026-03-08T23:26:10Z

| Benchmark | Value | Unit |
|---|---|---|
| compaction_active_p99 | 0.422 | ms |
| compaction_idle_p99 | 0.428 | ms |
| compaction_p99_delta | -0.0160 | ms |
| consumer_concurrency_100_throughput | 3076.0 | msg/s |
| consumer_concurrency_10_throughput | 3100.7 | msg/s |
| consumer_concurrency_1_throughput | 677.7 | msg/s |
| e2e_latency_p50_light | 0.332 | ms |
| e2e_latency_p95_light | 0.391 | ms |
| e2e_latency_p99_light | 0.482 | ms |
| enqueue_throughput_1kb | 3334.2 | msg/s |
| enqueue_throughput_1kb_mbps | 3.26 | MB/s |
| fairness_accuracy_max_deviation | 0.20 | % deviation |
| fairness_accuracy_tenant-1 | 0.20 | % deviation |
| fairness_accuracy_tenant-2 | 0.20 | % deviation |
| fairness_accuracy_tenant-3 | 0.10 | % deviation |
| fairness_accuracy_tenant-4 | 0.10 | % deviation |
| fairness_accuracy_tenant-5 | 0.10 | % deviation |
| fairness_overhead_fair_throughput | 1366.7 | msg/s |
| fairness_overhead_fifo_throughput | 1396.9 | msg/s |
| fairness_overhead_pct | 2.74 | % |
| key_cardinality_10_throughput | 3036.7 | msg/s |
| key_cardinality_10k_throughput | 1238.8 | msg/s |
| key_cardinality_1k_throughput | 2146.3 | msg/s |
| lua_on_enqueue_overhead_us | 24.1 | us |
| lua_throughput_with_hook | 1117.0 | msg/s |
| memory_per_message_overhead | 0.0 | bytes/msg |
| memory_rss_idle | 189.1 | MB |
| memory_rss_loaded_10k | 187.8 | MB |

@vieiralucas (Member, Author) commented:

Closing: Epic 13 reshaped per clustering architecture research. The Raft-per-queue model (see _bmad/docs/research/decoupled-scheduler-sharded-storage.md) makes the purpose-built storage engine redundant — RocksDB serves as the Raft state machine backend. Epic 13 is being replaced with a 2-story storage trait cleanup + clustering prep epic.

@vieiralucas vieiralucas deleted the feat/13.1-storage-trait-rocksdb-adapter branch March 11, 2026 02:02
vieiralucas added a commit that referenced this pull request Mar 11, 2026
…r-queue model

technical research validated cockroachdb-style single-binary raft-per-queue
clustering architecture. this makes the purpose-built storage engine (epic 13)
redundant and the partition-based clustering (epic 14) incompatible.

changes:
- close prs #49-53 (epic 13 storage engine, none merged)
- reshape epic 13: 5 stories → 2 (storage trait cleanup + phase 2 viability seams)
- replan epic 14: partition-based → raft-per-queue (5 stories redesigned)
- defer fr66 (custom storage engine) and nfr30-33 to post-clustering
- update prd, architecture doc, epics, sprint-status
