Make separate mids, rids and params blocks#8
Conversation
cfd2f6e to
2f4c4dc
Compare
36086c7 to
2fb657c
Compare
2fb657c to
dca4e31
Compare
WalkthroughThis set of changes refactors the handling of sequence ID blocks in the codebase. It replaces the old Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Sealed
participant seqids.Provider
participant seqids.Loader
participant disk.IndexReader
participant Cache
Client->>Sealed: Request MID/RID/DocPos for LID
Sealed->>seqids.Provider: MID/RID/DocPos(LID)
seqids.Provider->>Cache: Check block in cache
alt Not in cache
seqids.Provider->>seqids.Loader: Load block (MIDs/RIDs/Params)
seqids.Loader->>disk.IndexReader: Read block from disk
seqids.Loader->>Cache: Store block in cache
end
seqids.Provider->>Sealed: Return MID/RID/DocPos
Sealed->>Client: Return result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #8 +/- ##
==========================================
- Coverage 73.70% 73.53% -0.17%
==========================================
Files 191 192 +1
Lines 15807 15873 +66
==========================================
+ Hits 11651 11673 +22
- Misses 3582 3611 +29
- Partials 574 589 +15 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
frac/sealed/seqids/loader.go (1)
82-104: Consider consistent error checking patternThe empty block check occurs after unpacking, while in
GetMIDsBlockandGetRIDsBlockit happens before unpacking. Consider moving the empty check before unpacking for consistency, or document why this method needs different behavior.func (l *Loader) GetParamsBlock(index uint32) (BlockParams, error) { // load binary from index block, err := l.cacheParams.GetWithError(index, func() (BlockParams, int, error) { data, _, err := l.reader.ReadIndexBlock(l.paramsBlockIndex(index), nil) if err != nil { return BlockParams{}, 0, err } + // check for empty data before unpacking + if len(data) == 0 { + return BlockParams{}, 0, errors.New("empty block") + } // unpack block := BlockParams{Values: make([]uint64, 0, consts.IDsPerBlock)} if err = block.Unpack(data); err != nil { return BlockParams{}, 0, err } - if len(block.Values) == 0 { - return BlockParams{}, 0, errors.New("empty block") - } return block, cap(block.Values) * 8, nil })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
frac/active_sealer.go(1 hunks)frac/disk_blocks.go(1 hunks)frac/disk_blocks_producer.go(1 hunks)frac/disk_blocks_writer.go(2 hunks)frac/index_cache.go(1 hunks)frac/sealed.go(3 hunks)frac/sealed/seqids/blocks.go(1 hunks)frac/sealed/seqids/loader.go(1 hunks)frac/sealed/seqids/provider.go(1 hunks)frac/sealed/seqids/sealed_ids.go(0 hunks)frac/sealed/seqids/sealed_ids_test.go(0 hunks)frac/sealed/seqids/unpack_cache.go(1 hunks)frac/sealed_index.go(8 hunks)frac/sealed_loader.go(2 hunks)fracmanager/cache_maintainer.go(2 hunks)
💤 Files with no reviewable changes (2)
- frac/sealed/seqids/sealed_ids_test.go
- frac/sealed/seqids/sealed_ids.go
🧰 Additional context used
🧬 Code Graph Analysis (2)
frac/sealed/seqids/loader.go (1)
cmd/distribution/main.go (1)
buildDist(87-143)
frac/sealed_index.go (1)
cmd/distribution/main.go (1)
buildDist(87-143)
🪛 GitHub Check: lint
frac/sealed/seqids/loader.go
[failure] 91-91:
sloppyReassign: re-assignment to err can be replaced with err := block.Unpack(data) (gocritic)
[failure] 76-76:
sloppyReassign: re-assignment to err can be replaced with err := block.Unpack(data) (gocritic)
[failure] 52-52:
sloppyReassign: re-assignment to err can be replaced with err := block.Unpack(data) (gocritic)
frac/sealed_index.go
[failure] 335-335:
importShadow: shadow of imported from 'github.com/ozontech/seq-db/frac/sealed/lids' package 'lids' (gocritic)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (18)
frac/index_cache.go (1)
6-6: LGTM! Type refactoring aligns with the broader seqids package restructuring.The addition of the seqids import and the change from
[]uint64toseqids.BlockParamsfor the Params field provides better type safety and encapsulation. This change is consistent with the overall refactoring that introduces more granular block types.Also applies to: 14-14
fracmanager/cache_maintainer.go (1)
14-14: LGTM! Cache type update maintains consistency with the seqids refactoring.The import addition and the type change from
[]uint64toseqids.BlockParamsin theCreateIndexCachemethod ensures that the cache maintainer creates caches compatible with the new block types introduced in the seqids package.Also applies to: 150-150
frac/active_sealer.go (1)
210-215: LGTM! Type and field name updates align with the seqids refactoring.The change from
seqids.IDsTabletoseqids.Tableand the field name update fromDiskStartBlockIndextoStartBlockIndexare consistent with the broader refactoring. The simplified field name is more concise while maintaining clarity.frac/sealed_loader.go (1)
75-75: LGTM! Method signature and field name updates maintain consistency.The change from
seqids.IDsTabletoseqids.Tablein the return type and the field name update fromDiskStartBlockIndextoStartBlockIndexare consistent with the broader refactoring. The core logic for reading and decoding the index block remains unchanged and correct.Also applies to: 102-102
frac/disk_blocks.go (3)
7-9: LGTM! Import additions support the new idsBlock functionality.The addition of seqids and seq package imports is necessary to support the new idsBlock struct and its methods.
12-16: LGTM! New idsBlock struct provides unified container for block types.The idsBlock struct effectively consolidates the three separate block types (BlockMIDs, BlockRIDs, BlockParams) into a single container, which simplifies block generation and writing logic as mentioned in the AI summary.
26-29: LGTM! GetExtForRegistry method correctly extracts registry extensions.The method properly delegates to
GetMinID()and extracts the MID and RID components as uint64 values for registry use.frac/disk_blocks_writer.go (1)
77-104: LGTM! Clean refactoring to use the new block structure.The changes correctly adapt to the new
idsBlockstructure by accessing individual fields (mids,rids,params) and calling their respectivePackmethods. The control flow and error handling remain intact.frac/sealed.go (2)
41-41: Type rename looks good.The change from
seqids.IDsTabletoseqids.Tableis consistent across both theSealedandPreloadedDatastructs.Also applies to: 126-126
364-372: Good consolidation into the Provider pattern.The refactoring nicely consolidates the previous separate caches and loader into a single
seqids.Providerabstraction. This simplifies the data provider creation and better encapsulates the ID management logic.frac/disk_blocks_producer.go (2)
55-68: Verify that the push callback doesn't store the block pointer.The refactoring to reuse a single
blockvariable is efficient, but there's a potential issue: if thepushcallback stores the pointer instead of copying the block's data, all stored blocks would reference the same object with the last iteration's data.Please ensure that the
pushimplementation inwriteIDsBlocks(fromdisk_blocks_writer.go) properly extracts and stores the data rather than keeping references to the block.
71-80: Clean implementation of fillIDsBlock.The method correctly:
- Clears existing values by reslicing to 0
- Extracts MID and RID components from seq.ID
- Retrieves positions from DocsPositions
- Populates the respective value slices
frac/sealed/seqids/unpack_cache.go (1)
9-13: Good encapsulation and appropriate type sizing.The changes improve the design:
- Making the struct unexported provides better encapsulation
- Using
intfor blockIndex anduint32for LID is more appropriate than the previous types- Simplifying to just cache functionality aligns with the separation of concerns
Also applies to: 43-45
frac/sealed/seqids/blocks.go (2)
10-30: Well-implemented delta encoding for MIDs.The
BlockMIDsimplementation correctly uses varint delta encoding, which is efficient for sequential message IDs. Good error handling propagation from the unpack helper.
79-101: Helper functions are well-implemented.Good error handling in
unpackRawIDsVarint. Consider adding the same length validation tounpackRawIDsNoVarintthat was suggested forBlockRIDs.Unpack.frac/sealed/seqids/loader.go (1)
3-11: Fix import orderingThe imports should be grouped and ordered according to Go conventions: standard library, then external packages, then internal packages.
import ( "errors" "github.com/ozontech/seq-db/cache" "github.com/ozontech/seq-db/conf" "github.com/ozontech/seq-db/consts" "github.com/ozontech/seq-db/disk" "github.com/ozontech/seq-db/seq" )Likely an incorrect or invalid review comment.
frac/sealed/seqids/provider.go (1)
98-112: Well-documented memory safety considerationThe comment clearly explains why copying
block.Valuesis necessary to prevent data corruption from sync.Pool reuse. The implementation correctly usesappendto create a new backing array.frac/sealed_index.go (1)
158-180: Consider propagating errors instead of panickingThe new implementation panics on errors in
GetMID,GetRID, anddocPosmethods. While the detailed logging context is helpful, panicking should be reserved for truly unrecoverable situations. Consider whether these errors could be propagated to callers for more graceful handling.If panicking is indeed the desired behavior due to the critical nature of these operations, please document why error propagation is not feasible in this context.
dca4e31 to
26f8c1f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
frac/sealed/seqids/blocks.go (1)
32-55: Add validation for fixed-size unpacking.The
BlockRIDsimplementation handles version compatibility well, but the fixed-size unpacking should validate the input data length to prevent potential panics.Add validation to ensure data length is a multiple of 8:
func (b *BlockRIDs) Unpack(data []byte) error { if b.fracVersion < conf.BinaryDataV1 { values, err := unpackRawIDsVarint(data, b.Values) if err != nil { return err } b.Values = values return nil } + if len(data)%8 != 0 { + return errors.New("invalid data length for fixed-size unpacking") + } b.Values = unpackRawIDsNoVarint(data, b.Values) return nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
frac/active_sealer.go(1 hunks)frac/disk_blocks.go(1 hunks)frac/disk_blocks_producer.go(1 hunks)frac/disk_blocks_writer.go(2 hunks)frac/index_cache.go(1 hunks)frac/sealed.go(3 hunks)frac/sealed/seqids/blocks.go(1 hunks)frac/sealed/seqids/loader.go(1 hunks)frac/sealed/seqids/provider.go(1 hunks)frac/sealed/seqids/sealed_ids.go(0 hunks)frac/sealed/seqids/sealed_ids_test.go(0 hunks)frac/sealed/seqids/unpack_cache.go(1 hunks)frac/sealed_index.go(8 hunks)frac/sealed_loader.go(2 hunks)fracmanager/cache_maintainer.go(2 hunks)
💤 Files with no reviewable changes (2)
- frac/sealed/seqids/sealed_ids_test.go
- frac/sealed/seqids/sealed_ids.go
🚧 Files skipped from review as they are similar to previous changes (10)
- frac/index_cache.go
- fracmanager/cache_maintainer.go
- frac/active_sealer.go
- frac/sealed.go
- frac/disk_blocks_writer.go
- frac/sealed_loader.go
- frac/sealed/seqids/loader.go
- frac/disk_blocks.go
- frac/sealed/seqids/provider.go
- frac/sealed_index.go
🧰 Additional context used
🧬 Code Graph Analysis (3)
frac/disk_blocks_producer.go (1)
seq/seq.go (3)
ID(12-15)MID(17-17)RID(18-18)
frac/sealed/seqids/blocks.go (1)
conf/frac_version.go (2)
BinaryDataVersion(3-3)BinaryDataV1(9-9)
frac/sealed/seqids/unpack_cache.go (2)
consts/consts.go (1)
IDsPerBlock(14-14)frac/sealed/seqids/sealed_ids_test.go (1)
TestUnpackRIDs(19-53)
🔇 Additional comments (9)
frac/sealed/seqids/unpack_cache.go (4)
9-13: LGTM! Struct simplification aligns with new architecture.The changes to make the struct unexported and simplify field types are appropriate for the new design where unpacking logic is handled by dedicated block types.
15-15: LGTM! Pool variable rename is consistent.The rename from
unpackCachePooltocachePoolis cleaner and reflects the simplified purpose of the cache.
19-30: LGTM! Constructor simplification is appropriate.The function rename to
NewCacheand return type change to*unpackCacheare consistent with the struct changes and new architecture.
43-45: LGTM! Method signature updated correctly.The parameter type change from
uint64touint32forlidis consistent with thestartLIDfield type change and appears appropriate for the use case.frac/disk_blocks_producer.go (2)
55-69: LGTM! Method refactoring is clean and correct.The changes to use the local
idsBlocktype and the newfillIDsBlockmethod are well-structured and align with the broader architectural refactoring.
71-80: LGTM! Data extraction logic is correct.The
fillIDsBlockmethod correctly extracts MID, RID, and position data from theseq.IDslices and populates the three separate block fields. The slice reset pattern using[:0]is efficient for reusing underlying arrays.frac/sealed/seqids/blocks.go (3)
10-30: LGTM! BlockMIDs implementation is correct.The delta-varint encoding/decoding implementation is correct and includes proper error handling. The pack method efficiently encodes deltas, and the unpack method properly handles errors from the varint decoding.
57-77: LGTM! BlockParams implementation is correct.The delta-varint encoding/decoding implementation mirrors BlockMIDs correctly and includes proper error handling for varint decoding failures.
79-92: LGTM! Varint unpacking helper is robust.The
unpackRawIDsVarintfunction correctly handles varint decoding with proper error handling for invalid varint data. The delta reconstruction logic is correct.
26f8c1f to
aee1b0c
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
frac/sealed/seqids/loader.go (1)
82-104: Consider making GetParamsBlock consistent with other block methods.The
GetParamsBlockmethod differs fromGetMIDsBlockandGetRIDsBlockin that it doesn't accept a buffer parameter. This inconsistency could lead to confusion and prevents buffer reuse optimization for params blocks.Consider updating the method signature to match the pattern:
-func (l *Loader) GetParamsBlock(index uint32) (BlockParams, error) { +func (l *Loader) GetParamsBlock(index uint32, buf []uint64) (BlockParams, error) {And update the block creation to reuse the buffer:
- block := BlockParams{Values: make([]uint64, 0, consts.IDsPerBlock)} + block := BlockParams{Values: buf}frac/sealed/seqids/provider.go (3)
56-67: Use consistent table reference.The method accesses
p.loader.tableon line 63, but the Provider has its owntablefield that points to the same instance. For consistency and clarity, usep.tableinstead.- dst.startLID = p.loader.table.BlockStartLID(blockIndex) + dst.startLID = p.table.BlockStartLID(blockIndex)
77-88: Use consistent table reference.Similar to fillMIDs, this method should use
p.tableinstead ofp.loader.tablefor consistency.- dst.startLID = p.loader.table.BlockStartLID(blockIndex) + dst.startLID = p.table.BlockStartLID(blockIndex)
98-112: Use consistent table reference and good defensive copying.The method correctly copies the block values to prevent corruption when the unpackCache releases values to sync.Pool. However, it should use
p.tableinstead ofp.loader.tablefor consistency.- dst.startLID = p.loader.table.BlockStartLID(blockIndex) + dst.startLID = p.table.BlockStartLID(blockIndex)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
frac/active_sealer.go(1 hunks)frac/disk_blocks.go(1 hunks)frac/disk_blocks_producer.go(1 hunks)frac/disk_blocks_writer.go(2 hunks)frac/index_cache.go(1 hunks)frac/sealed.go(3 hunks)frac/sealed/seqids/blocks.go(1 hunks)frac/sealed/seqids/loader.go(1 hunks)frac/sealed/seqids/provider.go(1 hunks)frac/sealed/seqids/sealed_ids.go(0 hunks)frac/sealed/seqids/sealed_ids_test.go(0 hunks)frac/sealed/seqids/unpack_cache.go(1 hunks)frac/sealed_index.go(8 hunks)frac/sealed_loader.go(2 hunks)fracmanager/cache_maintainer.go(2 hunks)
💤 Files with no reviewable changes (2)
- frac/sealed/seqids/sealed_ids_test.go
- frac/sealed/seqids/sealed_ids.go
🚧 Files skipped from review as they are similar to previous changes (11)
- frac/index_cache.go
- frac/sealed_loader.go
- fracmanager/cache_maintainer.go
- frac/disk_blocks_writer.go
- frac/disk_blocks.go
- frac/active_sealer.go
- frac/disk_blocks_producer.go
- frac/sealed/seqids/unpack_cache.go
- frac/sealed/seqids/blocks.go
- frac/sealed.go
- frac/sealed_index.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
frac/sealed/seqids/loader.go (6)
consts/consts.go (1)
IDsPerBlock(14-14)disk/index_reader.go (1)
IndexReader(14-18)cache/cache.go (1)
Cache(60-68)frac/sealed/seqids/blocks.go (3)
BlockParams(57-59)BlockMIDs(10-12)BlockRIDs(32-35)conf/frac_version.go (1)
BinaryDataVersion(3-3)frac/sealed/seqids/sealed_ids.go (1)
NewIDsLoader(30-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (1)
frac/sealed/seqids/loader.go (1)
13-26: LGTM!The
Tablestruct and its methods correctly implement block index calculations. The use of value receivers is appropriate for these read-only operations.
Summary by CodeRabbit
Refactor
Bug Fixes
Chores