Add Unpack method for Offsets block #7
26f8c1f to aee1b0c
f2db039 to a87711d
📝 Walkthrough
Adds a new
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Suggested labels
Suggested reviewers
Actionable comments posted: 1
🧹 Nitpick comments (1)
frac/block_offsets.go (1)
8-11: Address the TODO comment.
The TODO suggests IDsTotal belongs in the Info block. Consider whether this field placement is temporary or needs architectural review.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- frac/block_offsets.go (1 hunks)
- frac/disk_blocks.go (0 hunks)
- frac/disk_blocks_producer.go (1 hunks)
- frac/disk_blocks_writer.go (1 hunks)
- frac/sealed_loader.go (1 hunks)
💤 Files with no reviewable changes (1)
- frac/disk_blocks.go
🔇 Additional comments (4)
frac/disk_blocks_producer.go (1)
29-34: LGTM! Clean API migration.
The method correctly returns the new BlockOffsets struct with properly mapped field names.
frac/disk_blocks_writer.go (1)
58-60: LGTM! Proper API update.
Method signature and call correctly updated to use the new BlockOffsets struct and its Pack method.
frac/sealed_loader.go (1)
76-86: LGTM! Excellent refactor.
Replacing manual binary parsing with BlockOffsets.Unpack improves maintainability and reduces error-prone code duplication.
frac/block_offsets.go (1)
13-23: LGTM! Efficient delta encoding implementation.
The Pack method correctly implements delta encoding with varint compression for optimal storage efficiency.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main       #7      +/-   ##
==========================================
+ Coverage   72.28%   72.30%   +0.02%
==========================================
  Files         195      196       +1
  Lines       17339    17359      +20
==========================================
+ Hits        12534    12552      +18
+ Misses       4120     4119       -1
- Partials      685      688       +3
a87711d to d8e9972
d8e9972 to c30e124
Actionable comments posted: 1
🧹 Nitpick comments (2)
frac/block_offsets.go (2)
17-21: Prefer unsigned varint for deltas to avoid sign conversions and widen range
Using signed varints forces int64 conversions and theoretically allows negative deltas; unsigned varints better match non-decreasing offsets and remove sign concerns.
Apply this diff to switch to uvarint encoding/decoding:
     func (b *BlockOffsets) Pack(buf []byte) []byte {
    @@
     	var prev uint64
     	for _, pos := range b.Offsets {
    -		buf = binary.AppendVarint(buf, int64(pos-prev))
    +		buf = binary.AppendUvarint(buf, pos-prev)
     		prev = pos
     	}
     	return buf
     }
    @@
     	b.Offsets = make([]uint64, 0, idsBlocksCount)
     	for len(data) != 0 {
    -		delta, n := binary.Varint(data)
    -		if n == 0 {
    -			return errors.New("blocks offset decoding error: varint returned 0")
    -		}
    -		if n < 0 {
    -			return errors.New("blocks offset decoding error: varint overflow")
    -		}
    +		delta, n := binary.Uvarint(data)
    +		if n == 0 {
    +			return errors.New("blocks offset decoding error: uvarint returned 0")
    +		}
    +		if n < 0 {
    +			return errors.New("blocks offset decoding error: uvarint overflow")
    +		}
     		data = data[n:]
    -		offset += uint64(delta)
    +		offset += delta
     		b.Offsets = append(b.Offsets, offset)
     	}
Also applies to: 35-45
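One practical effect of the suggestion above is size: Go's signed varint uses zigzag encoding, which spends one bit on the sign, so a non-negative delta crosses each byte boundary at half the value it would as an unsigned varint. The sketch below only illustrates that difference; `encodedSizes` is a hypothetical helper, not part of the PR.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodedSizes reports how many bytes a non-negative delta occupies as a
// signed (zigzag) varint and as an unsigned varint.
// Hypothetical helper for illustration only.
func encodedSizes(delta uint64) (signed, unsigned int) {
	signed = len(binary.AppendVarint(nil, int64(delta)))
	unsigned = len(binary.AppendUvarint(nil, delta))
	return signed, unsigned
}

func main() {
	// 64 and 8192 are the first values where the signed form needs an
	// extra byte compared to the unsigned form.
	for _, d := range []uint64{63, 64, 8191, 8192} {
		s, u := encodedSizes(d)
		fmt.Printf("delta=%d signed=%dB unsigned=%dB\n", d, s, u)
	}
}
```

Whether this saving matters depends on how large the typical gap between consecutive offsets is, which the review does not state.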
1-12: Add targeted tests to close coverage gaps and exercise error paths
Codecov flags missing lines here. Recommend unit tests for:
- Happy path: round-trip Pack → Unpack with several monotonically increasing offsets.
- Truncated headers: len(data)<4 for first and second Uint32.
- Varint decode issues: buffer ends mid-varint (n==0), and overflow case (n<0) if feasible to craft.
- Count mismatch: encode more/less deltas than idsBlocksCount.
I can draft a table-driven Go test (TestBlockOffsetsPackUnpack) that generates encoded buffers and verifies these cases. Want me to open a test PR snippet?
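A rough, self-contained sketch of the table of cases listed above could look like the following. It is not the PR's code: `blockOffsets` is a stand-in type, the header is assumed to be two little-endian uint32 values (count, then IDsTotal), and the error messages are illustrative. In the real repository the same table would live in a `_test.go` file and run through `t.Run`.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// blockOffsets is a stand-in for frac.BlockOffsets. The header layout
// (little-endian uint32 count, then uint32 IDsTotal) is an assumption.
type blockOffsets struct {
	IDsTotal uint32
	Offsets  []uint64
}

// Pack appends the header and the varint-encoded deltas to buf.
func (b *blockOffsets) Pack(buf []byte) []byte {
	buf = binary.LittleEndian.AppendUint32(buf, uint32(len(b.Offsets)))
	buf = binary.LittleEndian.AppendUint32(buf, b.IDsTotal)
	var prev uint64
	for _, pos := range b.Offsets {
		buf = binary.AppendVarint(buf, int64(pos-prev))
		prev = pos
	}
	return buf
}

// Unpack decodes data and rejects truncated, overflowing, or miscounted input.
func (b *blockOffsets) Unpack(data []byte) error {
	if len(data) < 8 {
		return errors.New("blocks offset decoding error: truncated header")
	}
	count := binary.LittleEndian.Uint32(data)
	b.IDsTotal = binary.LittleEndian.Uint32(data[4:])
	data = data[8:]
	b.Offsets = b.Offsets[:0]
	var offset uint64
	for len(data) != 0 {
		delta, n := binary.Varint(data)
		if n <= 0 { // n == 0: buffer ends mid-varint; n < 0: overflow
			return errors.New("blocks offset decoding error: bad varint")
		}
		data = data[n:]
		offset += uint64(delta)
		b.Offsets = append(b.Offsets, offset)
	}
	if uint32(len(b.Offsets)) != count {
		return fmt.Errorf("blocks offset decoding error: got %d offsets, want %d", len(b.Offsets), count)
	}
	return nil
}

// checkCases runs the table and returns the first failure, or "" if all pass.
func checkCases() string {
	want := blockOffsets{IDsTotal: 7, Offsets: []uint64{10, 250, 70000}}
	valid := want.Pack(nil) // 8 header bytes + deltas of 1, 2 and 3 bytes
	header := append([]byte{}, valid[:8]...)
	cases := []struct {
		name    string
		data    []byte
		wantErr bool
	}{
		{"round trip", valid, false},
		{"empty input", nil, true},
		{"header cut inside second uint32", valid[:6], true},
		{"buffer ends mid-varint", valid[:len(valid)-1], true},
		{"varint overflow", append(header, bytes.Repeat([]byte{0xff}, 11)...), true},
		{"fewer deltas than count", valid[:11], true},
		{"more deltas than count", binary.AppendVarint(append([]byte{}, valid...), 1), true},
	}
	for _, tc := range cases {
		var got blockOffsets
		err := got.Unpack(tc.data)
		if (err != nil) != tc.wantErr {
			return fmt.Sprintf("%s: err=%v, wantErr=%v", tc.name, err, tc.wantErr)
		}
		if err == nil && (got.IDsTotal != want.IDsTotal || fmt.Sprint(got.Offsets) != fmt.Sprint(want.Offsets)) {
			return fmt.Sprintf("%s: got %+v, want %+v", tc.name, got, want)
		}
	}
	return ""
}

func main() {
	if msg := checkCases(); msg != "" {
		fmt.Println("FAIL:", msg)
		return
	}
	fmt.Println("ok")
}
```

The overflow case is easy to craft: eleven 0xff bytes exceed binary.MaxVarintLen64, so binary.Varint reports a negative byte count.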
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
🚧 Files skipped from review as they are similar to previous changes (2)
- frac/disk_blocks_producer.go
- frac/sealed_loader.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
frac/disk_blocks_writer.go (2)
frac/block_offsets.go (1)
BlockOffsets (8-11)
consts/consts.go (1)
IDsBlockSize (14-14)
🔇 Additional comments (2)
frac/block_offsets.go (1)
46-49: Count mismatch validation looks good
Good catch adding a strict count check to prevent silent corruption when trailing bytes exist or the stream is malformed.
frac/disk_blocks_writer.go (1)
58-61: LGTM: writer correctly switched to BlockOffsets.Pack
Signature update and Pack invocation look consistent with the new API; buffer sizing via IDsBlockSize remains intact.
c30e124 to adb8a67
Actionable comments posted: 0
🧹 Nitpick comments (2)
frac/disk_blocks_writer.go (2)
58-61: Clarify “positions” vs “offsets” naming to avoid confusion.
If “positions” is an on-disk contract, keeping the block name makes sense; otherwise consider aligning names or add a brief comment for future readers.
Apply this minimal doc comment to clarify intent:
    -func (w *DiskBlocksWriter) writePositionsBlock(zstdCompressLevel int, block *BlockOffsets) error {
    +// Serializes BlockOffsets (count, IDsTotal, varint-encoded deltas) into the
    +// "positions" block in the registry (name kept for backward compatibility).
    +func (w *DiskBlocksWriter) writePositionsBlock(zstdCompressLevel int, block *BlockOffsets) error {
60-60: Confirm prealloc size choice for offsets block.
Using consts.IDsBlockSize to pre-size the buffer is likely fine, but please confirm it’s intentional for offsets (not IDs). If this block is typically small, you might consider RegularBlockSize or a smaller initial capacity to reduce transient allocations.
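If a tighter initial capacity were wanted, one option is to derive an upper bound from the number of offsets instead of reusing an unrelated constant. The sketch below assumes the header is two uint32 values (count and IDsTotal, as described in the doc comment above); `headerSize` and `maxPackedSize` are hypothetical names, not identifiers from the repository.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// headerSize assumes the packed header is two uint32 values
// (offsets count and IDsTotal). This layout is an assumption.
const headerSize = 2 * 4

// maxPackedSize returns an upper bound on the packed size for a given
// number of offsets: each delta takes at most binary.MaxVarintLen64 bytes.
func maxPackedSize(offsetsCount int) int {
	return headerSize + offsetsCount*binary.MaxVarintLen64
}

func main() {
	fmt.Println(maxPackedSize(0), maxPackedSize(1000))
}
```

This is a worst-case bound; typical deltas encode in far fewer than ten bytes, so a smaller estimate plus append-driven growth may be just as reasonable.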
🚧 Files skipped from review as they are similar to previous changes (3)
- frac/sealed_loader.go
- frac/block_offsets.go
- frac/disk_blocks_producer.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (1)
frac/disk_blocks_writer.go (1)
58-61: Switch to BlockOffsets and exported Pack looks correct.
Signature change and block.Pack(...) usage align with the new type; compression, error handling, and stats remain intact.
adb8a67 to 8b097dd
Actionable comments posted: 0
🧹 Nitpick comments (2)
frac/disk_blocks_writer.go (2)
58-58: Add a nil guard for block to avoid a panic on misuse.
Defensive check prevents a nil deref if callers miswire the pipeline.
Apply this diff:
     func (w *DiskBlocksWriter) writePositionsBlock(zstdCompressLevel int, block *BlockOffsets) error {
    -	now := time.Now()
    +	if block == nil {
    +		return errors.New("positions block is nil")
    +	}
    +	now := time.Now()
Also add the import:
    // add to the existing imports
    import (
    	"errors"
    )
58-58: Unify naming: zstdCompressLevel vs zstdLevel across writer methods.
This file mixes parameter names (see writeIDsBlocks vs others). Pick one (suggest: zstdCompressLevel) for consistency.
🚧 Files skipped from review as they are similar to previous changes (3)
- frac/block_offsets.go
- frac/disk_blocks_producer.go
- frac/sealed_loader.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
frac/disk_blocks_writer.go (1)
frac/block_offsets.go (1)
BlockOffsets (8-11)
🔇 Additional comments (2)
frac/disk_blocks_writer.go (2)
58-61: Switch to BlockOffsets.Pack looks correct; I/O path unchanged.
The new signature and serialization call align with the BlockOffsets API; the write path (compression, stats) remains intact.
58-61: Sanity check passed: no stale references found
- No occurrences of DiskPositionsBlock remain.
- All calls to writePositionsBlock are updated (in disk_blocks_writer.go and active_sealer.go).
- No lingering pack methods on the old positions/offsets types.
8b097dd to 49fe427
Actionable comments posted: 0
🧹 Nitpick comments (2)
frac/disk_blocks_writer.go (2)
59-61: Add a nil-guard for block to avoid a panic path.
Cheap safety check; returns early instead of dereferencing a nil pointer.
     	now := time.Now()
    +	if block == nil {
    +		return errors.New("writePositionsBlock: nil BlockOffsets")
    +	}
     	w.buf = block.Pack(w.resetBuf(consts.IDsBlockSize))
Add import:
    // at the top import block
    "errors"
58-61: BlockOffsets.Pack integration verified
- Writer still emits the "positions" block at frac/disk_blocks_writer.go:61.
- All on-disk readers now invoke BlockOffsets.Unpack (e.g. sealed_loader.go:81); there’s no remaining DiskPositionsBlock or old packer.
- There’s no format/version marker in BlockOffsets.Pack/Unpack.
If you don’t need to load legacy shards, this change is safe as-is. Otherwise, introduce a version or magic prefix in the positions block to distinguish formats.
Optionally, align the on-disk block name (currently "positions") with the BlockOffsets type for clarity.
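For the version-or-magic-prefix option, a minimal sketch might look like this. Everything here is hypothetical: the magic bytes, the version constant, and the `wrapVersioned`/`unwrapVersioned` helpers are invented for illustration and do not exist in the repository.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// positionsMagic and formatV1 are hypothetical values chosen for this sketch.
var positionsMagic = []byte{'P', 'O', 'S', 0}

const formatV1 byte = 1

// wrapVersioned prepends the magic bytes and a version byte to a packed payload.
func wrapVersioned(payload []byte) []byte {
	out := make([]byte, 0, len(positionsMagic)+1+len(payload))
	out = append(out, positionsMagic...)
	out = append(out, formatV1)
	return append(out, payload...)
}

// unwrapVersioned returns the payload and its format version.
// Data without the magic prefix is treated as a legacy block (version 0).
func unwrapVersioned(data []byte) (payload []byte, version byte, err error) {
	if !bytes.HasPrefix(data, positionsMagic) {
		return data, 0, nil
	}
	rest := data[len(positionsMagic):]
	if len(rest) == 0 {
		return nil, 0, errors.New("positions block: missing version byte")
	}
	if rest[0] != formatV1 {
		return nil, rest[0], fmt.Errorf("positions block: unknown version %d", rest[0])
	}
	return rest[1:], rest[0], nil
}

func main() {
	payload, version, err := unwrapVersioned(wrapVersioned([]byte{1, 2, 3}))
	fmt.Println(payload, version, err)
}
```

A prefix check like this is only a heuristic: a legacy block whose first bytes happen to equal the magic would be misread, so the magic should be chosen so that it cannot be a plausible legacy header.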
🚧 Files skipped from review as they are similar to previous changes (3)
- frac/disk_blocks_producer.go
- frac/sealed_loader.go
- frac/block_offsets.go
Summary by CodeRabbit
Refactor
Chores