138 fracmanager consistency #139

Closed
eguguchkin wants to merge 2 commits into main from 138-fracmanager-consistency

@eguguchkin (Collaborator) commented Sep 18, 2025

Description

Core System Refactoring for Fraction Management Consistency and Performance.

Fixes #138


  • I have read and followed all requirements in CONTRIBUTING.md;
  • I used LLM/AI assistance to make this pull request;

Summary by CodeRabbit

  • New Features

    • Direct Fetch and Search on fractions (no more DataProvider).
    • New sealing pipeline with streaming, better compression controls, and preloading.
    • Lifecycle manager with simple startup/shutdown (New(...) returns stop()).
    • Persistent storage-state tracking for capacity limits.
    • ULID-based fraction IDs for orderly naming.
  • Refactor

    • Type consolidation into common/sealed packages; updated APIs (Fractions(), Oldest(), IsCapacityExceeded()).
    • Reworked loader, registry, and offloading workflow for stability and efficiency.
    • Simplified cache maintenance and removed legacy metrics/internals.
  • Documentation

    • Expanded token table documentation for clarity.

@github-actions
Contributor

PR Title Validation Failed
Please refer to CONTRIBUTING.md

@eguguchkin closed this Sep 18, 2025
@coderabbitai

coderabbitai Bot commented Sep 18, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Large refactor introducing a new sealing subsystem and restructuring fraction lifecycle management. Replaces frac.Info with common.Info, removes DataProvider in favor of direct Fetch/Search, adds ActiveSealingSource and sealed/sealing builders, overhauls fracmanager (registry, loader, tasks, state), adjusts store/gRPC entry points, and deletes legacy sealing/block writer utilities.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Buffer pool | bytespool/writer.go | Added ReleaseWriter for centralized cleanup; FlushReleaseWriter delegates to it. |
| CLI refactors | cmd/distribution/main.go, cmd/index_analyzer/main.go, cmd/seq-db/seq-db.go | Switched to common/sealed types; updated cache APIs; added goroutine management in analyzer; expanded SealParams usage; removed ShouldReplay. |
| Constants | consts/consts.go | Removed IDsBlockSize; renamed SealOnExitFracSizePercent to MinSealPercent; reordered RegularBlockSize. |
| Fraction API overhaul | frac/fraction.go, frac/active.go, frac/remote.go, frac/sealed.go, frac/active_index.go, frac/sealed_index.go | Replaced DataProvider with Fetch/Search; migrated Info() to common.Info; adjusted internal provider creation; simplified Active/Remote/Sealed lifecycles. |
| Active docs positions | frac/active_docs_positions.go | DocsPositions now uses idToPos map and lidToPos slice; added GetSync; refactored SetMultiple logic. |
| Remove legacy sealing v1 | frac/active_sealer.go, frac/active_sealer_test.go, frac/disk_blocks.go, frac/disk_blocks_producer.go, frac/disk_blocks_writer.go, frac/seal_stats.go | Deleted old sealing pipeline, related tests, and disk block helpers/writer/stats. |
| Sealing v2: common and sealed | frac/common/info.go, frac/common/seal_params.go, frac/sealed/block_info.go, frac/sealed/block_offsets.go, frac/sealed/preloaded_data.go | Introduced common.Info method signature using iter.Seq; added common.SealParams; moved sealed block types to sealed package; added PreloadedData and BlocksData. |
| Sealing v2: builders and sealer | frac/sealed/sealing/* | Added blocks builder, index sealer, Seal(Source, params) entrypoint, stats logger (internal), and tests. |
| Active sealing source | frac/active_sealing_source.go | Added ActiveSealingSource to prepare/sort and stream data for sealing; exposes sequences for tokens/fields/IDs/docs. |
| Token table docs | frac/sealed/token/table.go, .../table_entry.go, .../table_loader.go | Documentation improvements; added TableBlock type for on-disk representation. |
| FracManager: architecture | fracmanager/fracmanager.go, fracmanager/lifecycle_manager.go, fracmanager/fraction_provider.go, fracmanager/fraction_registry.go, fracmanager/loader.go, fracmanager/tasks.go, fracmanager/storage_state.go, fracmanager/frac_manifest.go | New lifecycle with registry, loader, task/state managers, ULID-based naming; provider now seals via sealing v2 and offloads; new StopFunc; public API adjusted (Writer, Fractions, Oldest, IsCapacityExceeded, Active, Append). |
| FracManager: cache and maintenance | fracmanager/frac_info_cache.go, fracmanager/cache_maintainer.go, fracmanager/cache_maintainer_metrics.go | Renamed/retyped frac info cache to common.Info; simplified RunCleanLoop; moved Prometheus cache metrics to dedicated file with accessors. |
| FracManager: indexer lifecycle | frac/active_indexer.go | NewActiveIndexer returns stop func; removed Stop; start() now returns cleanup closure. |
| Async/search/fetch paths | fracmanager/async_searcher.go, fracmanager/fetcher.go, fracmanager/searcher.go | Switched from DataProvider to direct Search/Fetch calls. |
| Storage removals and API tweak | storage/block_former.go, storage/blocks_stats.go, storage/blocks_writer.go, storage/index_block_header.go | Deleted former/writer/stats; NewIndexBlockHeader now takes explicit size/rawSize. |
| Store/gRPC wiring | storeapi/grpc_*.go, storeapi/store.go | Replaced GetAllFracs with Fractions; updated capacity check to IsCapacityExceeded; Oldest() usage; Store now uses New(...) and stop callback. |
| Tests and setup updates | fracmanager/*_test.go, fracmanager/sealer_test.go, fracmanager/fraction_provider_test.go, tests/setup/env.go | Updated to common.Info, new constructors/APIs, added ULID generator test, new sealing test path; removed ResetCache helpers; new WaitIdleForTests and SealForcedForTests. |
| Command-line distribution cache | cmd/distribution/main.go | Switched sealed frac cache to frac info cache; API renames (Get/Add). |
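
The "Active docs positions" row describes a structure built from an idToPos map and a lidToPos slice with a GetSync accessor. A minimal sketch of that layout might look like the following. Only the names DocsPositions, idToPos, lidToPos, GetSync, and SetMultiple come from the table; the string IDs, the DocPos type, the ByLID helper, and the first-write-wins behavior of SetMultiple are assumptions, since the real logic in frac/active_docs_positions.go is not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// DocPos is a hypothetical position of a document inside a docs file.
type DocPos uint64

// DocsPositions keeps two views of the same data: a map for lookups by
// document ID and a slice indexed by local ID (LID) in insertion order.
type DocsPositions struct {
	mu       sync.RWMutex
	idToPos  map[string]DocPos
	lidToPos []DocPos
}

func NewDocsPositions() *DocsPositions {
	return &DocsPositions{idToPos: make(map[string]DocPos)}
}

// GetSync looks up a position by document ID under the read lock.
func (dp *DocsPositions) GetSync(id string) (DocPos, bool) {
	dp.mu.RLock()
	defer dp.mu.RUnlock()
	pos, ok := dp.idToPos[id]
	return pos, ok
}

// ByLID looks up a position by local ID under the read lock.
func (dp *DocsPositions) ByLID(lid uint32) (DocPos, bool) {
	dp.mu.RLock()
	defer dp.mu.RUnlock()
	if int(lid) >= len(dp.lidToPos) {
		return 0, false
	}
	return dp.lidToPos[lid], true
}

// SetMultiple records positions for a batch of IDs and returns the LIDs of
// the entries it added. Assumption: an ID that is already known keeps its
// first position (first write wins) and gets no new LID.
func (dp *DocsPositions) SetMultiple(ids []string, pos []DocPos) []uint32 {
	n := len(ids)
	if len(pos) < n {
		n = len(pos) // ignore IDs that have no matching position
	}

	dp.mu.Lock()
	defer dp.mu.Unlock()

	added := make([]uint32, 0, n)
	for i := 0; i < n; i++ {
		if _, exists := dp.idToPos[ids[i]]; exists {
			continue
		}
		dp.idToPos[ids[i]] = pos[i]
		dp.lidToPos = append(dp.lidToPos, pos[i])
		added = append(added, uint32(len(dp.lidToPos)-1))
	}
	return added
}

func main() {
	dp := NewDocsPositions()
	lids := dp.SetMultiple([]string{"a", "b", "a"}, []DocPos{10, 20, 30})
	pos, ok := dp.GetSync("a")
	fmt.Println(lids, pos, ok)
}
```

Keeping the slice alongside the map trades some memory for index-based access by LID without hashing, which is the kind of access a sealing pass that walks documents in LID order would favor.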

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~180+ minutes

Possibly related PRs

Suggested labels

epic/sealing_v2, epic/s3-offloading

Suggested reviewers

  • forshev
  • dkharms
  • moflotas

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e717bc and 745f260.

📒 Files selected for processing (72)
  • bytespool/writer.go (1 hunks)
  • cmd/distribution/main.go (7 hunks)
  • cmd/index_analyzer/main.go (3 hunks)
  • cmd/seq-db/seq-db.go (2 hunks)
  • consts/consts.go (2 hunks)
  • frac/active.go (5 hunks)
  • frac/active_docs_positions.go (2 hunks)
  • frac/active_index.go (2 hunks)
  • frac/active_indexer.go (3 hunks)
  • frac/active_sealer.go (0 hunks)
  • frac/active_sealer_test.go (0 hunks)
  • frac/active_sealing_source.go (1 hunks)
  • frac/common/info.go (2 hunks)
  • frac/common/seal_params.go (1 hunks)
  • frac/disk_blocks.go (0 hunks)
  • frac/disk_blocks_producer.go (0 hunks)
  • frac/disk_blocks_writer.go (0 hunks)
  • frac/fraction.go (1 hunks)
  • frac/remote.go (7 hunks)
  • frac/seal_stats.go (0 hunks)
  • frac/sealed.go (10 hunks)
  • frac/sealed/block_info.go (2 hunks)
  • frac/sealed/block_offsets.go (1 hunks)
  • frac/sealed/preloaded_data.go (1 hunks)
  • frac/sealed/sealing/blocks_builder.go (1 hunks)
  • frac/sealed/sealing/blocks_builder_test.go (1 hunks)
  • frac/sealed/sealing/index.go (1 hunks)
  • frac/sealed/sealing/sealer.go (1 hunks)
  • frac/sealed/sealing/stats.go (1 hunks)
  • frac/sealed/token/table.go (1 hunks)
  • frac/sealed/token/table_entry.go (1 hunks)
  • frac/sealed/token/table_loader.go (1 hunks)
  • frac/sealed_index.go (2 hunks)
  • frac/sealed_loader.go (4 hunks)
  • fracmanager/async_searcher.go (1 hunks)
  • fracmanager/async_searcher_test.go (2 hunks)
  • fracmanager/cache_maintainer.go (1 hunks)
  • fracmanager/cache_maintainer_metrics.go (1 hunks)
  • fracmanager/config.go (2 hunks)
  • fracmanager/fetcher.go (1 hunks)
  • fracmanager/fetcher_test.go (3 hunks)
  • fracmanager/frac_info_cache.go (5 hunks)
  • fracmanager/frac_info_cache_test.go (15 hunks)
  • fracmanager/frac_manifest.go (1 hunks)
  • fracmanager/fracmanager.go (1 hunks)
  • fracmanager/fracmanager_for_tests.go (1 hunks)
  • fracmanager/fracmanager_test.go (4 hunks)
  • fracmanager/fracs_stats.go (1 hunks)
  • fracmanager/fraction_provider.go (2 hunks)
  • fracmanager/fraction_provider_test.go (1 hunks)
  • fracmanager/fraction_registry.go (1 hunks)
  • fracmanager/lifecycle_manager.go (1 hunks)
  • fracmanager/loader.go (1 hunks)
  • fracmanager/proxy_frac.go (1 hunks)
  • fracmanager/sealer_test.go (5 hunks)
  • fracmanager/searcher.go (1 hunks)
  • fracmanager/searcher_test.go (2 hunks)
  • fracmanager/storage_state.go (1 hunks)
  • fracmanager/tasks.go (1 hunks)
  • metric/store.go (0 hunks)
  • storage/block_former.go (0 hunks)
  • storage/blocks_stats.go (0 hunks)
  • storage/blocks_writer.go (0 hunks)
  • storage/index_block_header.go (1 hunks)
  • storeapi/grpc_async_search.go (1 hunks)
  • storeapi/grpc_fetch.go (1 hunks)
  • storeapi/grpc_search.go (3 hunks)
  • storeapi/grpc_status.go (1 hunks)
  • storeapi/grpc_v1.go (1 hunks)
  • storeapi/grpc_v1_test.go (2 hunks)
  • storeapi/store.go (5 hunks)
  • tests/setup/env.go (5 hunks)

@github-actions
Contributor

PR Title Validation Failed
Please refer to CONTRIBUTING.md
